ray init kwargs: {'num_cpus': None, 'runtime_env': {'env_vars': {'TOKENIZERS_PARALLELISM': 'true', 'NCCL_DEBUG': 'WARN', 'VLLM_LOGGING_LEVEL': 'WARN', 'VLLM_ALLOW_RUNTIME_LORA_UPDATING': 'true', 'CUDA_DEVICE_MAX_CONNECTIONS': '1', 'NCCL_CUMEM_ENABLE': '0', 'VLLM_DISABLE_COMPILE_CACHE': '1', 'HCCL_HOST_SOCKET_PORT_RANGE': 'auto', 'HCCL_NPU_SOCKET_PORT_RANGE': 'auto'}, 'working_dir': None}}
2026-04-12 11:27:02,205	WARNING services.py:2168 -- WARNING: The object store is using /tmp instead of /dev/shm because /dev/shm has only 8360747008 bytes available. This will harm performance! You may be able to free up space by deleting files in /dev/shm. If you are inside a Docker container, you can increase /dev/shm size by passing '--shm-size=10.24gb' to 'docker run' (or add it to the run_options list in a Ray cluster config). Make sure to set this to more than 30% of available RAM.
2026-04-12 11:27:05,380	INFO worker.py:2004 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265
/storage/workspace/server-5/rl/miniconda3/envs/verl/lib/python3.10/site-packages/ray/_private/worker.py:2052: FutureWarning: Tip: In future versions of Ray, Ray will no longer override accelerator visible devices env var if num_gpus=0 or num_gpus=None (default). To enable this behavior and turn off this error message, set RAY_ACCEL_ENV_VAR_OVERRIDE_ON_ZERO=0
  warnings.warn(
[36m(TaskRunner pid=2823680)[0m TaskRunner hostname: gpu-ssh-server-5, PID: 2823680
[36m(TaskRunner pid=2823680)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/trainer/main_ppo.py:302: UserWarning: Disabled critic as algorithm.adv_estimator != gae. If it is not intended, please set critic.enable=True
[36m(TaskRunner pid=2823680)[0m   use_critic=need_critic(config),
[36m(TaskRunner pid=2823680)[0m {'actor_rollout_ref': {'actor': {'_target_': 'verl.workers.config.FSDPActorConfig',
[36m(TaskRunner pid=2823680)[0m                                  'calculate_entropy': True,
[36m(TaskRunner pid=2823680)[0m                                  'calculate_sum_pi_squared': False,
[36m(TaskRunner pid=2823680)[0m                                  'checkpoint': {'_target_': 'verl.trainer.config.CheckpointConfig',
[36m(TaskRunner pid=2823680)[0m                                                 'async_save': False,
[36m(TaskRunner pid=2823680)[0m                                                 'load_contents': ['model',
[36m(TaskRunner pid=2823680)[0m                                                                   'optimizer',
[36m(TaskRunner pid=2823680)[0m                                                                   'extra'],
[36m(TaskRunner pid=2823680)[0m                                                 'mbridge_config': {},
[36m(TaskRunner pid=2823680)[0m                                                 'save_contents': ['model',
[36m(TaskRunner pid=2823680)[0m                                                                   'optimizer',
[36m(TaskRunner pid=2823680)[0m                                                                   'extra']},
[36m(TaskRunner pid=2823680)[0m                                  'clip_ratio': 0.2,
[36m(TaskRunner pid=2823680)[0m                                  'clip_ratio_c': 3.0,
[36m(TaskRunner pid=2823680)[0m                                  'clip_ratio_high': 0.2,
[36m(TaskRunner pid=2823680)[0m                                  'clip_ratio_low': 0.2,
[36m(TaskRunner pid=2823680)[0m                                  'data_loader_seed': 42,
[36m(TaskRunner pid=2823680)[0m                                  'entropy_checkpointing': False,
[36m(TaskRunner pid=2823680)[0m                                  'entropy_coeff': 0,
[36m(TaskRunner pid=2823680)[0m                                  'entropy_from_logits_with_chunking': False,
[36m(TaskRunner pid=2823680)[0m                                  'freeze_vision_tower': False,
[36m(TaskRunner pid=2823680)[0m                                  'fsdp_config': {'_target_': 'verl.workers.config.FSDPEngineConfig',
[36m(TaskRunner pid=2823680)[0m                                                  'dtype': 'bfloat16',
[36m(TaskRunner pid=2823680)[0m                                                  'entropy_checkpointing': False,
[36m(TaskRunner pid=2823680)[0m                                                  'entropy_from_logits_with_chunking': False,
[36m(TaskRunner pid=2823680)[0m                                                  'forward_only': False,
[36m(TaskRunner pid=2823680)[0m                                                  'forward_prefetch': False,
[36m(TaskRunner pid=2823680)[0m                                                  'fsdp_size': -1,
[36m(TaskRunner pid=2823680)[0m                                                  'full_determinism': False,
[36m(TaskRunner pid=2823680)[0m                                                  'model_dtype': 'fp32',
[36m(TaskRunner pid=2823680)[0m                                                  'offload_policy': False,
[36m(TaskRunner pid=2823680)[0m                                                  'optimizer_offload': False,
[36m(TaskRunner pid=2823680)[0m                                                  'param_offload': False,
[36m(TaskRunner pid=2823680)[0m                                                  'qat': {'_target_': 'verl.workers.config.QATEngineConfig',
[36m(TaskRunner pid=2823680)[0m                                                          'activation_observer': 'static_minmax',
[36m(TaskRunner pid=2823680)[0m                                                          'enable': False,
[36m(TaskRunner pid=2823680)[0m                                                          'group_size': 16,
[36m(TaskRunner pid=2823680)[0m                                                          'ignore_patterns': ['lm_head',
[36m(TaskRunner pid=2823680)[0m                                                                              'embed_tokens',
[36m(TaskRunner pid=2823680)[0m                                                                              're:.*mlp.gate$'],
[36m(TaskRunner pid=2823680)[0m                                                          'mode': 'w4a16',
[36m(TaskRunner pid=2823680)[0m                                                          'quantization_config_path': None},
[36m(TaskRunner pid=2823680)[0m                                                  'reshard_after_forward': True,
[36m(TaskRunner pid=2823680)[0m                                                  'seed': 42,
[36m(TaskRunner pid=2823680)[0m                                                  'strategy': 'fsdp',
[36m(TaskRunner pid=2823680)[0m                                                  'ulysses_sequence_parallel_size': 1,
[36m(TaskRunner pid=2823680)[0m                                                  'use_orig_params': False,
[36m(TaskRunner pid=2823680)[0m                                                  'use_torch_compile': True,
[36m(TaskRunner pid=2823680)[0m                                                  'wrap_policy': {'min_num_params': 0}},
[36m(TaskRunner pid=2823680)[0m                                  'grad_clip': 1.0,
[36m(TaskRunner pid=2823680)[0m                                  'kl_loss_coef': 0.0,
[36m(TaskRunner pid=2823680)[0m                                  'kl_loss_type': 'low_var_kl',
[36m(TaskRunner pid=2823680)[0m                                  'loss_agg_mode': 'token-mean',
[36m(TaskRunner pid=2823680)[0m                                  'loss_scale_factor': None,
[36m(TaskRunner pid=2823680)[0m                                  'optim': {'_target_': 'verl.workers.config.FSDPOptimizerConfig',
[36m(TaskRunner pid=2823680)[0m                                            'betas': [0.9, 0.999],
[36m(TaskRunner pid=2823680)[0m                                            'clip_grad': 1.0,
[36m(TaskRunner pid=2823680)[0m                                            'lr': 1e-06,
[36m(TaskRunner pid=2823680)[0m                                            'lr_scheduler_type': 'constant',
[36m(TaskRunner pid=2823680)[0m                                            'lr_warmup_steps': -1,
[36m(TaskRunner pid=2823680)[0m                                            'lr_warmup_steps_ratio': 0.0,
[36m(TaskRunner pid=2823680)[0m                                            'min_lr_ratio': 0.0,
[36m(TaskRunner pid=2823680)[0m                                            'num_cycles': 0.5,
[36m(TaskRunner pid=2823680)[0m                                            'optimizer': 'AdamW',
[36m(TaskRunner pid=2823680)[0m                                            'optimizer_impl': 'torch.optim',
[36m(TaskRunner pid=2823680)[0m                                            'override_optimizer_config': None,
[36m(TaskRunner pid=2823680)[0m                                            'total_training_steps': -1,
[36m(TaskRunner pid=2823680)[0m                                            'warmup_style': None,
[36m(TaskRunner pid=2823680)[0m                                            'weight_decay': 0.01,
[36m(TaskRunner pid=2823680)[0m                                            'zero_indexed_step': True},
[36m(TaskRunner pid=2823680)[0m                                  'policy_loss': {'_target_': 'verl.workers.config.PolicyLossConfig',
[36m(TaskRunner pid=2823680)[0m                                                  'clip_cov_lb': 1.0,
[36m(TaskRunner pid=2823680)[0m                                                  'clip_cov_ratio': 0.0002,
[36m(TaskRunner pid=2823680)[0m                                                  'clip_cov_ub': 5.0,
[36m(TaskRunner pid=2823680)[0m                                                  'kl_cov_ratio': 0.0002,
[36m(TaskRunner pid=2823680)[0m                                                  'loss_mode': 'vanilla',
[36m(TaskRunner pid=2823680)[0m                                                  'ppo_kl_coef': 0.1},
[36m(TaskRunner pid=2823680)[0m                                  'ppo_epochs': 1,
[36m(TaskRunner pid=2823680)[0m                                  'ppo_max_token_len_per_gpu': 10240,
[36m(TaskRunner pid=2823680)[0m                                  'ppo_micro_batch_size': None,
[36m(TaskRunner pid=2823680)[0m                                  'ppo_micro_batch_size_per_gpu': 4,
[36m(TaskRunner pid=2823680)[0m                                  'ppo_mini_batch_size': 8,
[36m(TaskRunner pid=2823680)[0m                                  'profiler': {'_target_': 'verl.utils.profiler.ProfilerConfig',
[36m(TaskRunner pid=2823680)[0m                                               'all_ranks': False,
[36m(TaskRunner pid=2823680)[0m                                               'enable': False,
[36m(TaskRunner pid=2823680)[0m                                               'ranks': [],
[36m(TaskRunner pid=2823680)[0m                                               'save_path': 'outputs/profile',
[36m(TaskRunner pid=2823680)[0m                                               'tool': None,
[36m(TaskRunner pid=2823680)[0m                                               'tool_config': {'npu': {'_target_': 'verl.utils.profiler.config.NPUToolConfig',
[36m(TaskRunner pid=2823680)[0m                                                                       'analysis': True,
[36m(TaskRunner pid=2823680)[0m                                                                       'contents': [],
[36m(TaskRunner pid=2823680)[0m                                                                       'discrete': False,
[36m(TaskRunner pid=2823680)[0m                                                                       'level': 'level0'},
[36m(TaskRunner pid=2823680)[0m                                                               'nsys': {'_target_': 'verl.utils.profiler.config.NsightToolConfig',
[36m(TaskRunner pid=2823680)[0m                                                                        'discrete': False},
[36m(TaskRunner pid=2823680)[0m                                                               'torch': {'_target_': 'verl.utils.profiler.config.TorchProfilerToolConfig',
[36m(TaskRunner pid=2823680)[0m                                                                         'contents': [],
[36m(TaskRunner pid=2823680)[0m                                                                         'discrete': False},
[36m(TaskRunner pid=2823680)[0m                                                               'torch_memory': {'_target_': 'verl.utils.profiler.config.TorchMemoryToolConfig',
[36m(TaskRunner pid=2823680)[0m                                                                                'stack_depth': 32,
[36m(TaskRunner pid=2823680)[0m                                                                                'trace_alloc_max_entries': 100000}}},
[36m(TaskRunner pid=2823680)[0m                                  'qat': {'activation_observer': 'static_minmax',
[36m(TaskRunner pid=2823680)[0m                                          'enable': False,
[36m(TaskRunner pid=2823680)[0m                                          'group_size': 16,
[36m(TaskRunner pid=2823680)[0m                                          'ignore_patterns': ['lm_head',
[36m(TaskRunner pid=2823680)[0m                                                              'embed_tokens',
[36m(TaskRunner pid=2823680)[0m                                                              're:.*mlp.gate$'],
[36m(TaskRunner pid=2823680)[0m                                          'mode': 'w4a16',
[36m(TaskRunner pid=2823680)[0m                                          'quantization_config_path': None},
[36m(TaskRunner pid=2823680)[0m                                  'rollout_n': 8,
[36m(TaskRunner pid=2823680)[0m                                  'router_replay': {'_target_': 'verl.workers.config.RouterReplayConfig',
[36m(TaskRunner pid=2823680)[0m                                                    'mode': 'disabled',
[36m(TaskRunner pid=2823680)[0m                                                    'record_file': None,
[36m(TaskRunner pid=2823680)[0m                                                    'replay_file': None},
[36m(TaskRunner pid=2823680)[0m                                  'shuffle': False,
[36m(TaskRunner pid=2823680)[0m                                  'strategy': 'fsdp',
[36m(TaskRunner pid=2823680)[0m                                  'sum_pi_squared_checkpointing': False,
[36m(TaskRunner pid=2823680)[0m                                  'tau_neg': 1.05,
[36m(TaskRunner pid=2823680)[0m                                  'tau_pos': 1.0,
[36m(TaskRunner pid=2823680)[0m                                  'ulysses_sequence_parallel_size': 1,
[36m(TaskRunner pid=2823680)[0m                                  'use_dynamic_bsz': False,
[36m(TaskRunner pid=2823680)[0m                                  'use_fused_kernels': False,
[36m(TaskRunner pid=2823680)[0m                                  'use_kl_loss': False,
[36m(TaskRunner pid=2823680)[0m                                  'use_prefix_grouper': False,
[36m(TaskRunner pid=2823680)[0m                                  'use_remove_padding': True,
[36m(TaskRunner pid=2823680)[0m                                  'use_torch_compile': True},
[36m(TaskRunner pid=2823680)[0m                        'hybrid_engine': True,
[36m(TaskRunner pid=2823680)[0m                        'model': {'_target_': 'verl.workers.config.HFModelConfig',
[36m(TaskRunner pid=2823680)[0m                                  'custom_chat_template': None,
[36m(TaskRunner pid=2823680)[0m                                  'enable_activation_offload': False,
[36m(TaskRunner pid=2823680)[0m                                  'enable_gradient_checkpointing': True,
[36m(TaskRunner pid=2823680)[0m                                  'exclude_modules': None,
[36m(TaskRunner pid=2823680)[0m                                  'external_lib': None,
[36m(TaskRunner pid=2823680)[0m                                  'fused_kernel_options': {'impl_backend': 'torch'},
[36m(TaskRunner pid=2823680)[0m                                  'hf_config_path': None,
[36m(TaskRunner pid=2823680)[0m                                  'lora_adapter_path': None,
[36m(TaskRunner pid=2823680)[0m                                  'lora_alpha': 16,
[36m(TaskRunner pid=2823680)[0m                                  'lora_rank': 0,
[36m(TaskRunner pid=2823680)[0m                                  'mtp': {'_target_': 'verl.workers.config.MtpConfig',
[36m(TaskRunner pid=2823680)[0m                                          'detach_encoder': False,
[36m(TaskRunner pid=2823680)[0m                                          'enable': False,
[36m(TaskRunner pid=2823680)[0m                                          'enable_rollout': False,
[36m(TaskRunner pid=2823680)[0m                                          'enable_train': False,
[36m(TaskRunner pid=2823680)[0m                                          'method': 'mtp',
[36m(TaskRunner pid=2823680)[0m                                          'mtp_loss_scaling_factor': 0.1,
[36m(TaskRunner pid=2823680)[0m                                          'num_speculative_tokens': 1,
[36m(TaskRunner pid=2823680)[0m                                          'speculative_algorithm': 'EAGLE',
[36m(TaskRunner pid=2823680)[0m                                          'speculative_eagle_topk': 1,
[36m(TaskRunner pid=2823680)[0m                                          'speculative_num_draft_tokens': 4,
[36m(TaskRunner pid=2823680)[0m                                          'speculative_num_steps': 3},
[36m(TaskRunner pid=2823680)[0m                                  'override_config': {'attn_implementation': 'flash_attention_2'},
[36m(TaskRunner pid=2823680)[0m                                  'path': 'RoadQAQ/Qwen2.5-Math-1.5B-16k-think',
[36m(TaskRunner pid=2823680)[0m                                  'target_modules': 'all-linear',
[36m(TaskRunner pid=2823680)[0m                                  'tiled_mlp': {'enabled': False,
[36m(TaskRunner pid=2823680)[0m                                                'num_shards': 4},
[36m(TaskRunner pid=2823680)[0m                                  'tokenizer_path': None,
[36m(TaskRunner pid=2823680)[0m                                  'trust_remote_code': False,
[36m(TaskRunner pid=2823680)[0m                                  'use_fused_kernels': False,
[36m(TaskRunner pid=2823680)[0m                                  'use_liger': False,
[36m(TaskRunner pid=2823680)[0m                                  'use_remove_padding': True,
[36m(TaskRunner pid=2823680)[0m                                  'use_shm': False},
[36m(TaskRunner pid=2823680)[0m                        'nccl_timeout': 600,
[36m(TaskRunner pid=2823680)[0m                        'ref': {'_target_': 'verl.workers.config.FSDPActorConfig',
[36m(TaskRunner pid=2823680)[0m                                'entropy_checkpointing': False,
[36m(TaskRunner pid=2823680)[0m                                'entropy_from_logits_with_chunking': False,
[36m(TaskRunner pid=2823680)[0m                                'fsdp_config': {'_target_': 'verl.workers.config.FSDPEngineConfig',
[36m(TaskRunner pid=2823680)[0m                                                'dtype': 'bfloat16',
[36m(TaskRunner pid=2823680)[0m                                                'entropy_checkpointing': False,
[36m(TaskRunner pid=2823680)[0m                                                'entropy_from_logits_with_chunking': False,
[36m(TaskRunner pid=2823680)[0m                                                'forward_only': True,
[36m(TaskRunner pid=2823680)[0m                                                'forward_prefetch': False,
[36m(TaskRunner pid=2823680)[0m                                                'fsdp_size': -1,
[36m(TaskRunner pid=2823680)[0m                                                'full_determinism': False,
[36m(TaskRunner pid=2823680)[0m                                                'model_dtype': 'fp32',
[36m(TaskRunner pid=2823680)[0m                                                'offload_policy': False,
[36m(TaskRunner pid=2823680)[0m                                                'optimizer_offload': False,
[36m(TaskRunner pid=2823680)[0m                                                'param_offload': False,
[36m(TaskRunner pid=2823680)[0m                                                'qat': {'_target_': 'verl.workers.config.QATEngineConfig',
[36m(TaskRunner pid=2823680)[0m                                                        'activation_observer': 'static_minmax',
[36m(TaskRunner pid=2823680)[0m                                                        'enable': False,
[36m(TaskRunner pid=2823680)[0m                                                        'group_size': 16,
[36m(TaskRunner pid=2823680)[0m                                                        'ignore_patterns': ['lm_head',
[36m(TaskRunner pid=2823680)[0m                                                                            'embed_tokens',
[36m(TaskRunner pid=2823680)[0m                                                                            're:.*mlp.gate$'],
[36m(TaskRunner pid=2823680)[0m                                                        'mode': 'w4a16',
[36m(TaskRunner pid=2823680)[0m                                                        'quantization_config_path': None},
[36m(TaskRunner pid=2823680)[0m                                                'reshard_after_forward': True,
[36m(TaskRunner pid=2823680)[0m                                                'seed': 42,
[36m(TaskRunner pid=2823680)[0m                                                'strategy': 'fsdp',
[36m(TaskRunner pid=2823680)[0m                                                'ulysses_sequence_parallel_size': 1,
[36m(TaskRunner pid=2823680)[0m                                                'use_orig_params': False,
[36m(TaskRunner pid=2823680)[0m                                                'use_torch_compile': True,
[36m(TaskRunner pid=2823680)[0m                                                'wrap_policy': {'min_num_params': 0}},
[36m(TaskRunner pid=2823680)[0m                                'log_prob_max_token_len_per_gpu': 10240,
[36m(TaskRunner pid=2823680)[0m                                'log_prob_micro_batch_size': None,
[36m(TaskRunner pid=2823680)[0m                                'log_prob_micro_batch_size_per_gpu': 32,
[36m(TaskRunner pid=2823680)[0m                                'log_prob_use_dynamic_bsz': False,
[36m(TaskRunner pid=2823680)[0m                                'profiler': {'_target_': 'verl.utils.profiler.ProfilerConfig',
[36m(TaskRunner pid=2823680)[0m                                             'all_ranks': False,
[36m(TaskRunner pid=2823680)[0m                                             'enable': False,
[36m(TaskRunner pid=2823680)[0m                                             'ranks': [],
[36m(TaskRunner pid=2823680)[0m                                             'save_path': 'outputs/profile',
[36m(TaskRunner pid=2823680)[0m                                             'tool': None,
[36m(TaskRunner pid=2823680)[0m                                             'tool_config': {'npu': {'_target_': 'verl.utils.profiler.config.NPUToolConfig',
[36m(TaskRunner pid=2823680)[0m                                                                     'analysis': True,
[36m(TaskRunner pid=2823680)[0m                                                                     'contents': [],
[36m(TaskRunner pid=2823680)[0m                                                                     'discrete': False,
[36m(TaskRunner pid=2823680)[0m                                                                     'level': 'level0'},
[36m(TaskRunner pid=2823680)[0m                                                             'nsys': {'_target_': 'verl.utils.profiler.config.NsightToolConfig',
[36m(TaskRunner pid=2823680)[0m                                                                      'discrete': False},
[36m(TaskRunner pid=2823680)[0m                                                             'torch': {'_target_': 'verl.utils.profiler.config.TorchProfilerToolConfig',
[36m(TaskRunner pid=2823680)[0m                                                                       'contents': [],
[36m(TaskRunner pid=2823680)[0m                                                                       'discrete': False},
[36m(TaskRunner pid=2823680)[0m                                                             'torch_memory': {'_target_': 'verl.utils.profiler.config.TorchMemoryToolConfig',
[36m(TaskRunner pid=2823680)[0m                                                                              'stack_depth': 32,
[36m(TaskRunner pid=2823680)[0m                                                                              'trace_alloc_max_entries': 100000}}},
[36m(TaskRunner pid=2823680)[0m                                'rollout_n': 8,
[36m(TaskRunner pid=2823680)[0m                                'router_replay': {'_target_': 'verl.workers.config.RouterReplayConfig',
[36m(TaskRunner pid=2823680)[0m                                                  'mode': 'disabled',
[36m(TaskRunner pid=2823680)[0m                                                  'record_file': None,
[36m(TaskRunner pid=2823680)[0m                                                  'replay_file': None},
[36m(TaskRunner pid=2823680)[0m                                'strategy': 'fsdp',
[36m(TaskRunner pid=2823680)[0m                                'ulysses_sequence_parallel_size': 1,
[36m(TaskRunner pid=2823680)[0m                                'use_torch_compile': True},
[36m(TaskRunner pid=2823680)[0m                        'rollout': {'_target_': 'verl.workers.config.RolloutConfig',
[36m(TaskRunner pid=2823680)[0m                                    'agent': {'_target_': 'verl.workers.config.AgentLoopConfig',
[36m(TaskRunner pid=2823680)[0m                                              'agent_loop_config_path': None,
[36m(TaskRunner pid=2823680)[0m                                              'custom_async_server': {'_target_': 'verl.workers.config.CustomAsyncServerConfig',
[36m(TaskRunner pid=2823680)[0m                                                                      'name': None,
[36m(TaskRunner pid=2823680)[0m                                                                      'path': None},
[36m(TaskRunner pid=2823680)[0m                                              'default_agent_loop': 'single_turn_agent',
[36m(TaskRunner pid=2823680)[0m                                              'num_workers': 8},
[36m(TaskRunner pid=2823680)[0m                                    'calculate_log_probs': False,
[36m(TaskRunner pid=2823680)[0m                                    'checkpoint_engine': {'_target_': 'verl.workers.config.CheckpointEngineConfig',
[36m(TaskRunner pid=2823680)[0m                                                          'backend': 'naive',
[36m(TaskRunner pid=2823680)[0m                                                          'engine_kwargs': {},
[36m(TaskRunner pid=2823680)[0m                                                          'update_weights_bucket_megabytes': 4096},
[36m(TaskRunner pid=2823680)[0m                                    'cudagraph_capture_sizes': None,
[36m(TaskRunner pid=2823680)[0m                                    'data_parallel_size': 1,
[36m(TaskRunner pid=2823680)[0m                                    'disable_log_stats': True,
[36m(TaskRunner pid=2823680)[0m                                    'do_sample': True,
[36m(TaskRunner pid=2823680)[0m                                    'dtype': 'bfloat16',
[36m(TaskRunner pid=2823680)[0m                                    'enable_chunked_prefill': True,
[36m(TaskRunner pid=2823680)[0m                                    'enable_prefix_caching': True,
[36m(TaskRunner pid=2823680)[0m                                    'enable_rollout_routing_replay': False,
[36m(TaskRunner pid=2823680)[0m                                    'enforce_eager': False,
[36m(TaskRunner pid=2823680)[0m                                    'engine_kwargs': {'sglang': {},
[36m(TaskRunner pid=2823680)[0m                                                      'trtllm': {},
[36m(TaskRunner pid=2823680)[0m                                                      'vllm': {'distributed_executor_backend': 'uni'}},
[36m(TaskRunner pid=2823680)[0m                                    'expert_parallel_size': 1,
[36m(TaskRunner pid=2823680)[0m                                    'free_cache_engine': True,
[36m(TaskRunner pid=2823680)[0m                                    'gpu_memory_utilization': 0.6,
[36m(TaskRunner pid=2823680)[0m                                    'ignore_eos': False,
[36m(TaskRunner pid=2823680)[0m                                    'layered_summon': False,
[36m(TaskRunner pid=2823680)[0m                                    'load_format': 'dummy',
[36m(TaskRunner pid=2823680)[0m                                    'log_prob_max_token_len_per_gpu': 10240,
[36m(TaskRunner pid=2823680)[0m                                    'log_prob_micro_batch_size': None,
[36m(TaskRunner pid=2823680)[0m                                    'log_prob_micro_batch_size_per_gpu': 4,
[36m(TaskRunner pid=2823680)[0m                                    'log_prob_use_dynamic_bsz': False,
[36m(TaskRunner pid=2823680)[0m                                    'logprobs_mode': 'processed_logprobs',
[36m(TaskRunner pid=2823680)[0m                                    'max_model_len': 16384,
[36m(TaskRunner pid=2823680)[0m                                    'max_num_batched_tokens': 8192,
[36m(TaskRunner pid=2823680)[0m                                    'max_num_seqs': 1024,
[36m(TaskRunner pid=2823680)[0m                                    'mode': 'async',
[36m(TaskRunner pid=2823680)[0m                                    'mtp': {'_target_': 'verl.workers.config.MtpConfig',
[36m(TaskRunner pid=2823680)[0m                                            'detach_encoder': False,
[36m(TaskRunner pid=2823680)[0m                                            'enable': False,
[36m(TaskRunner pid=2823680)[0m                                            'enable_rollout': False,
[36m(TaskRunner pid=2823680)[0m                                            'enable_train': False,
[36m(TaskRunner pid=2823680)[0m                                            'method': 'mtp',
[36m(TaskRunner pid=2823680)[0m                                            'mtp_loss_scaling_factor': 0.1,
[36m(TaskRunner pid=2823680)[0m                                            'num_speculative_tokens': 1,
[36m(TaskRunner pid=2823680)[0m                                            'speculative_algorithm': 'EAGLE',
[36m(TaskRunner pid=2823680)[0m                                            'speculative_eagle_topk': 1,
[36m(TaskRunner pid=2823680)[0m                                            'speculative_num_draft_tokens': 4,
[36m(TaskRunner pid=2823680)[0m                                            'speculative_num_steps': 3},
[36m(TaskRunner pid=2823680)[0m                                    'multi_stage_wake_up': False,
[36m(TaskRunner pid=2823680)[0m                                    'multi_turn': {'_target_': 'verl.workers.config.MultiTurnConfig',
[36m(TaskRunner pid=2823680)[0m                                                   'enable': False,
[36m(TaskRunner pid=2823680)[0m                                                   'format': 'hermes',
[36m(TaskRunner pid=2823680)[0m                                                   'interaction_config_path': None,
[36m(TaskRunner pid=2823680)[0m                                                   'max_assistant_turns': None,
[36m(TaskRunner pid=2823680)[0m                                                   'max_parallel_calls': 1,
[36m(TaskRunner pid=2823680)[0m                                                   'max_tool_response_length': 256,
[36m(TaskRunner pid=2823680)[0m                                                   'max_user_turns': None,
[36m(TaskRunner pid=2823680)[0m                                                   'num_repeat_rollouts': None,
[36m(TaskRunner pid=2823680)[0m                                                   'tokenization_sanity_check_mode': 'strict',
[36m(TaskRunner pid=2823680)[0m                                                   'tool_config_path': None,
[36m(TaskRunner pid=2823680)[0m                                                   'tool_response_truncate_side': 'middle',
[36m(TaskRunner pid=2823680)[0m                                                   'use_inference_chat_template': False},
[36m(TaskRunner pid=2823680)[0m                                    'n': 8,
[36m(TaskRunner pid=2823680)[0m                                    'n_gpus_per_node': 4,
[36m(TaskRunner pid=2823680)[0m                                    'name': 'vllm',
[36m(TaskRunner pid=2823680)[0m                                    'nnodes': 0,
[36m(TaskRunner pid=2823680)[0m                                    'over_sample_rate': 0,
[36m(TaskRunner pid=2823680)[0m                                    'pipeline_model_parallel_size': 1,
[36m(TaskRunner pid=2823680)[0m                                    'profiler': {'_target_': 'verl.utils.profiler.ProfilerConfig',
[36m(TaskRunner pid=2823680)[0m                                                 'all_ranks': False,
[36m(TaskRunner pid=2823680)[0m                                                 'enable': False,
[36m(TaskRunner pid=2823680)[0m                                                 'ranks': [],
[36m(TaskRunner pid=2823680)[0m                                                 'save_path': 'outputs/profile',
[36m(TaskRunner pid=2823680)[0m                                                 'tool': None,
[36m(TaskRunner pid=2823680)[0m                                                 'tool_config': {'npu': {'_target_': 'verl.utils.profiler.config.NPUToolConfig',
[36m(TaskRunner pid=2823680)[0m                                                                         'analysis': True,
[36m(TaskRunner pid=2823680)[0m                                                                         'contents': [],
[36m(TaskRunner pid=2823680)[0m                                                                         'discrete': False,
[36m(TaskRunner pid=2823680)[0m                                                                         'level': 'level0'},
[36m(TaskRunner pid=2823680)[0m                                                                 'torch': {'_target_': 'verl.utils.profiler.config.TorchProfilerToolConfig',
[36m(TaskRunner pid=2823680)[0m                                                                           'contents': [],
[36m(TaskRunner pid=2823680)[0m                                                                           'discrete': False}}},
[36m(TaskRunner pid=2823680)[0m                                    'prometheus': {'_target_': 'verl.workers.config.PrometheusConfig',
[36m(TaskRunner pid=2823680)[0m                                                   'enable': False,
[36m(TaskRunner pid=2823680)[0m                                                   'file': '/tmp/ray/session_latest/metrics/prometheus/prometheus.yml',
[36m(TaskRunner pid=2823680)[0m                                                   'port': 9090,
[36m(TaskRunner pid=2823680)[0m                                                   'served_model_name': 'RoadQAQ/Qwen2.5-Math-1.5B-16k-think'},
[36m(TaskRunner pid=2823680)[0m                                    'prompt_length': 2048,
[36m(TaskRunner pid=2823680)[0m                                    'qat': {'_target_': 'verl.workers.config.QATEngineConfig',
[36m(TaskRunner pid=2823680)[0m                                            'activation_observer': 'static_minmax',
[36m(TaskRunner pid=2823680)[0m                                            'enable': False,
[36m(TaskRunner pid=2823680)[0m                                            'group_size': 16,
[36m(TaskRunner pid=2823680)[0m                                            'ignore_patterns': ['lm_head',
[36m(TaskRunner pid=2823680)[0m                                                                'embed_tokens',
[36m(TaskRunner pid=2823680)[0m                                                                're:.*mlp.gate$'],
[36m(TaskRunner pid=2823680)[0m                                            'mode': 'w4a16',
[36m(TaskRunner pid=2823680)[0m                                            'quantization_config_path': None},
[36m(TaskRunner pid=2823680)[0m                                    'quantization': None,
[36m(TaskRunner pid=2823680)[0m                                    'quantization_config_file': None,
[36m(TaskRunner pid=2823680)[0m                                    'response_length': 8192,
[36m(TaskRunner pid=2823680)[0m                                    'scheduling_policy': 'fcfs',
[36m(TaskRunner pid=2823680)[0m                                    'skip_dump_dir': '/tmp/rollout_dump',
[36m(TaskRunner pid=2823680)[0m                                    'skip_rollout': False,
[36m(TaskRunner pid=2823680)[0m                                    'skip_tokenizer_init': True,
[36m(TaskRunner pid=2823680)[0m                                    'temperature': 1.0,
[36m(TaskRunner pid=2823680)[0m                                    'tensor_model_parallel_size': 1,
[36m(TaskRunner pid=2823680)[0m                                    'top_k': -1,
[36m(TaskRunner pid=2823680)[0m                                    'top_p': 1,
[36m(TaskRunner pid=2823680)[0m                                    'trace': {'_target_': 'verl.workers.config.TraceConfig',
[36m(TaskRunner pid=2823680)[0m                                              'backend': None,
[36m(TaskRunner pid=2823680)[0m                                              'experiment_name': 'efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633',
[36m(TaskRunner pid=2823680)[0m                                              'max_samples_per_step_per_worker': None,
[36m(TaskRunner pid=2823680)[0m                                              'project_name': 'efficiency',
[36m(TaskRunner pid=2823680)[0m                                              'token2text': False},
[36m(TaskRunner pid=2823680)[0m                                    'val_kwargs': {'_target_': 'verl.workers.config.SamplingConfig',
[36m(TaskRunner pid=2823680)[0m                                                   'do_sample': True,
[36m(TaskRunner pid=2823680)[0m                                                   'n': 16,
[36m(TaskRunner pid=2823680)[0m                                                   'n_per_data_source': {'aime2024': 16,
[36m(TaskRunner pid=2823680)[0m                                                                         'aime2025': 16,
[36m(TaskRunner pid=2823680)[0m                                                                         'math500': 4},
[36m(TaskRunner pid=2823680)[0m                                                   'temperature': 0.7,
[36m(TaskRunner pid=2823680)[0m                                                   'top_k': -1,
[36m(TaskRunner pid=2823680)[0m                                                   'top_p': 0.8},
[36m(TaskRunner pid=2823680)[0m                                    'val_response_length': 14336}},
[36m(TaskRunner pid=2823680)[0m  'algorithm': {'_target_': 'verl.trainer.config.AlgoConfig',
[36m(TaskRunner pid=2823680)[0m                'adv_estimator': 'grpo',
[36m(TaskRunner pid=2823680)[0m                'gamma': 1.0,
[36m(TaskRunner pid=2823680)[0m                'kl_ctrl': {'_target_': 'verl.trainer.config.KLControlConfig',
[36m(TaskRunner pid=2823680)[0m                            'horizon': 10000,
[36m(TaskRunner pid=2823680)[0m                            'kl_coef': 0.001,
[36m(TaskRunner pid=2823680)[0m                            'target_kl': 0.1,
[36m(TaskRunner pid=2823680)[0m                            'type': 'fixed'},
[36m(TaskRunner pid=2823680)[0m                'kl_penalty': 'kl',
[36m(TaskRunner pid=2823680)[0m                'lam': 1.0,
[36m(TaskRunner pid=2823680)[0m                'norm_adv_by_std_in_grpo': True,
[36m(TaskRunner pid=2823680)[0m                'pf_ppo': {'reweight_method': 'pow', 'weight_pow': 2.0},
[36m(TaskRunner pid=2823680)[0m                'rollout_correction': {'bypass_mode': False,
[36m(TaskRunner pid=2823680)[0m                                       'loss_type': 'ppo_clip',
[36m(TaskRunner pid=2823680)[0m                                       'rollout_is': None,
[36m(TaskRunner pid=2823680)[0m                                       'rollout_is_batch_normalize': False,
[36m(TaskRunner pid=2823680)[0m                                       'rollout_is_threshold': 2.0,
[36m(TaskRunner pid=2823680)[0m                                       'rollout_rs': None,
[36m(TaskRunner pid=2823680)[0m                                       'rollout_rs_threshold': None},
[36m(TaskRunner pid=2823680)[0m                'use_kl_in_reward': True,
[36m(TaskRunner pid=2823680)[0m                'use_pf_ppo': False},
[36m(TaskRunner pid=2823680)[0m  'critic': {'_target_': 'verl.workers.config.FSDPCriticConfig',
[36m(TaskRunner pid=2823680)[0m             'checkpoint': {'_target_': 'verl.trainer.config.CheckpointConfig',
[36m(TaskRunner pid=2823680)[0m                            'async_save': False,
[36m(TaskRunner pid=2823680)[0m                            'load_contents': ['model', 'optimizer', 'extra'],
[36m(TaskRunner pid=2823680)[0m                            'mbridge_config': {},
[36m(TaskRunner pid=2823680)[0m                            'save_contents': ['model', 'optimizer', 'extra']},
[36m(TaskRunner pid=2823680)[0m             'cliprange_value': 0.5,
[36m(TaskRunner pid=2823680)[0m             'data_loader_seed': 42,
[36m(TaskRunner pid=2823680)[0m             'enable': None,
[36m(TaskRunner pid=2823680)[0m             'forward_max_token_len_per_gpu': 32768,
[36m(TaskRunner pid=2823680)[0m             'forward_micro_batch_size': None,
[36m(TaskRunner pid=2823680)[0m             'forward_micro_batch_size_per_gpu': None,
[36m(TaskRunner pid=2823680)[0m             'grad_clip': 1.0,
[36m(TaskRunner pid=2823680)[0m             'loss_agg_mode': 'token-mean',
[36m(TaskRunner pid=2823680)[0m             'model': {'_target_': 'verl.workers.config.FSDPCriticModelCfg',
[36m(TaskRunner pid=2823680)[0m                       'enable_activation_offload': False,
[36m(TaskRunner pid=2823680)[0m                       'enable_gradient_checkpointing': True,
[36m(TaskRunner pid=2823680)[0m                       'external_lib': None,
[36m(TaskRunner pid=2823680)[0m                       'fsdp_config': {'_target_': 'verl.workers.config.FSDPEngineConfig',
[36m(TaskRunner pid=2823680)[0m                                       'dtype': 'bfloat16',
[36m(TaskRunner pid=2823680)[0m                                       'entropy_checkpointing': False,
[36m(TaskRunner pid=2823680)[0m                                       'entropy_from_logits_with_chunking': False,
[36m(TaskRunner pid=2823680)[0m                                       'forward_only': False,
[36m(TaskRunner pid=2823680)[0m                                       'forward_prefetch': False,
[36m(TaskRunner pid=2823680)[0m                                       'fsdp_size': -1,
[36m(TaskRunner pid=2823680)[0m                                       'full_determinism': False,
[36m(TaskRunner pid=2823680)[0m                                       'model_dtype': 'fp32',
[36m(TaskRunner pid=2823680)[0m                                       'offload_policy': False,
[36m(TaskRunner pid=2823680)[0m                                       'optimizer_offload': False,
[36m(TaskRunner pid=2823680)[0m                                       'param_offload': False,
[36m(TaskRunner pid=2823680)[0m                                       'qat': {'_target_': 'verl.workers.config.QATEngineConfig',
[36m(TaskRunner pid=2823680)[0m                                               'activation_observer': 'static_minmax',
[36m(TaskRunner pid=2823680)[0m                                               'enable': False,
[36m(TaskRunner pid=2823680)[0m                                               'group_size': 16,
[36m(TaskRunner pid=2823680)[0m                                               'ignore_patterns': ['lm_head',
[36m(TaskRunner pid=2823680)[0m                                                                   'embed_tokens',
[36m(TaskRunner pid=2823680)[0m                                                                   're:.*mlp.gate$'],
[36m(TaskRunner pid=2823680)[0m                                               'mode': 'w4a16',
[36m(TaskRunner pid=2823680)[0m                                               'quantization_config_path': None},
[36m(TaskRunner pid=2823680)[0m                                       'reshard_after_forward': True,
[36m(TaskRunner pid=2823680)[0m                                       'seed': 42,
[36m(TaskRunner pid=2823680)[0m                                       'strategy': 'fsdp',
[36m(TaskRunner pid=2823680)[0m                                       'ulysses_sequence_parallel_size': 1,
[36m(TaskRunner pid=2823680)[0m                                       'use_orig_params': False,
[36m(TaskRunner pid=2823680)[0m                                       'use_torch_compile': True,
[36m(TaskRunner pid=2823680)[0m                                       'wrap_policy': {'min_num_params': 0}},
[36m(TaskRunner pid=2823680)[0m                       'lora_alpha': 16,
[36m(TaskRunner pid=2823680)[0m                       'lora_rank': 0,
[36m(TaskRunner pid=2823680)[0m                       'override_config': {},
[36m(TaskRunner pid=2823680)[0m                       'path': '~/models/deepseek-llm-7b-chat',
[36m(TaskRunner pid=2823680)[0m                       'target_modules': 'all-linear',
[36m(TaskRunner pid=2823680)[0m                       'tiled_mlp': {'enabled': False, 'num_shards': 4},
[36m(TaskRunner pid=2823680)[0m                       'tokenizer_path': 'RoadQAQ/Qwen2.5-Math-1.5B-16k-think',
[36m(TaskRunner pid=2823680)[0m                       'trust_remote_code': False,
[36m(TaskRunner pid=2823680)[0m                       'use_remove_padding': False,
[36m(TaskRunner pid=2823680)[0m                       'use_shm': False},
[36m(TaskRunner pid=2823680)[0m             'optim': {'_target_': 'verl.workers.config.FSDPOptimizerConfig',
[36m(TaskRunner pid=2823680)[0m                       'betas': [0.9, 0.999],
[36m(TaskRunner pid=2823680)[0m                       'clip_grad': 1.0,
[36m(TaskRunner pid=2823680)[0m                       'lr': 1e-05,
[36m(TaskRunner pid=2823680)[0m                       'lr_scheduler_type': 'constant',
[36m(TaskRunner pid=2823680)[0m                       'lr_warmup_steps': -1,
[36m(TaskRunner pid=2823680)[0m                       'lr_warmup_steps_ratio': 0.0,
[36m(TaskRunner pid=2823680)[0m                       'min_lr_ratio': 0.0,
[36m(TaskRunner pid=2823680)[0m                       'num_cycles': 0.5,
[36m(TaskRunner pid=2823680)[0m                       'optimizer': 'AdamW',
[36m(TaskRunner pid=2823680)[0m                       'optimizer_impl': 'torch.optim',
[36m(TaskRunner pid=2823680)[0m                       'override_optimizer_config': None,
[36m(TaskRunner pid=2823680)[0m                       'total_training_steps': -1,
[36m(TaskRunner pid=2823680)[0m                       'warmup_style': None,
[36m(TaskRunner pid=2823680)[0m                       'weight_decay': 0.01,
[36m(TaskRunner pid=2823680)[0m                       'zero_indexed_step': True},
[36m(TaskRunner pid=2823680)[0m             'ppo_epochs': 1,
[36m(TaskRunner pid=2823680)[0m             'ppo_max_token_len_per_gpu': 32768,
[36m(TaskRunner pid=2823680)[0m             'ppo_micro_batch_size': None,
[36m(TaskRunner pid=2823680)[0m             'ppo_micro_batch_size_per_gpu': None,
[36m(TaskRunner pid=2823680)[0m             'ppo_mini_batch_size': 8,
[36m(TaskRunner pid=2823680)[0m             'profiler': {'_target_': 'verl.utils.profiler.ProfilerConfig',
[36m(TaskRunner pid=2823680)[0m                          'all_ranks': False,
[36m(TaskRunner pid=2823680)[0m                          'enable': False,
[36m(TaskRunner pid=2823680)[0m                          'ranks': [],
[36m(TaskRunner pid=2823680)[0m                          'save_path': 'outputs/profile',
[36m(TaskRunner pid=2823680)[0m                          'tool': None,
[36m(TaskRunner pid=2823680)[0m                          'tool_config': {'npu': {'_target_': 'verl.utils.profiler.config.NPUToolConfig',
[36m(TaskRunner pid=2823680)[0m                                                  'analysis': True,
[36m(TaskRunner pid=2823680)[0m                                                  'contents': [],
[36m(TaskRunner pid=2823680)[0m                                                  'discrete': False,
[36m(TaskRunner pid=2823680)[0m                                                  'level': 'level0'},
[36m(TaskRunner pid=2823680)[0m                                          'nsys': {'_target_': 'verl.utils.profiler.config.NsightToolConfig',
[36m(TaskRunner pid=2823680)[0m                                                   'discrete': False},
[36m(TaskRunner pid=2823680)[0m                                          'torch': {'_target_': 'verl.utils.profiler.config.TorchProfilerToolConfig',
[36m(TaskRunner pid=2823680)[0m                                                    'contents': [],
[36m(TaskRunner pid=2823680)[0m                                                    'discrete': False},
[36m(TaskRunner pid=2823680)[0m                                          'torch_memory': {'_target_': 'verl.utils.profiler.config.TorchMemoryToolConfig',
[36m(TaskRunner pid=2823680)[0m                                                           'stack_depth': 32,
[36m(TaskRunner pid=2823680)[0m                                                           'trace_alloc_max_entries': 100000}}},
[36m(TaskRunner pid=2823680)[0m             'rollout_n': 8,
[36m(TaskRunner pid=2823680)[0m             'shuffle': False,
[36m(TaskRunner pid=2823680)[0m             'strategy': 'fsdp',
[36m(TaskRunner pid=2823680)[0m             'ulysses_sequence_parallel_size': 1,
[36m(TaskRunner pid=2823680)[0m             'use_dynamic_bsz': False},
[36m(TaskRunner pid=2823680)[0m  'data': {'apply_chat_template_kwargs': {},
[36m(TaskRunner pid=2823680)[0m           'cluster_sampler': {'acc_high_threshold': 0.6875,
[36m(TaskRunner pid=2823680)[0m                               'acc_low_threshold': 0.3125,
[36m(TaskRunner pid=2823680)[0m                               'active_clusters': 16,
[36m(TaskRunner pid=2823680)[0m                               'allow_batch_cluster_fallback': True,
[36m(TaskRunner pid=2823680)[0m                               'cluster_key': 'cluster_id',
[36m(TaskRunner pid=2823680)[0m                               'cluster_size_key': 'cluster_size',
[36m(TaskRunner pid=2823680)[0m                               'consecutive_mid_threshold': 2,
[36m(TaskRunner pid=2823680)[0m                               'frontier_advance_consecutive_mid': 16,
[36m(TaskRunner pid=2823680)[0m                               'frontier_advance_hard': 16,
[36m(TaskRunner pid=2823680)[0m                               'frontier_advance_high': 16,
[36m(TaskRunner pid=2823680)[0m                               'frontier_advance_mid': 0,
[36m(TaskRunner pid=2823680)[0m                               'log_top_k': 10,
[36m(TaskRunner pid=2823680)[0m                               'prob_snapshot_log_interval': 1,
[36m(TaskRunner pid=2823680)[0m                               'rank_key': 'rank_in_cluster',
[36m(TaskRunner pid=2823680)[0m                               'sample_id_key': 'sample_id',
[36m(TaskRunner pid=2823680)[0m                               'samples_per_cluster': 8,
[36m(TaskRunner pid=2823680)[0m                               'score_beta': 0.3,
[36m(TaskRunner pid=2823680)[0m                               'score_init': 2.0,
[36m(TaskRunner pid=2823680)[0m                               'score_max': 5.0,
[36m(TaskRunner pid=2823680)[0m                               'score_min': 1.0,
[36m(TaskRunner pid=2823680)[0m                               'seed': 42,
[36m(TaskRunner pid=2823680)[0m                               'target_score_high': 5,
[36m(TaskRunner pid=2823680)[0m                               'target_score_low': 1,
[36m(TaskRunner pid=2823680)[0m                               'target_score_mid': 3,
[36m(TaskRunner pid=2823680)[0m                               'window_size': 32},
[36m(TaskRunner pid=2823680)[0m           'custom_cls': {'name': None, 'path': None},
[36m(TaskRunner pid=2823680)[0m           'datagen': {'name': None, 'path': None},
[36m(TaskRunner pid=2823680)[0m           'dataloader_num_workers': 0,
[36m(TaskRunner pid=2823680)[0m           'filter_overlong_prompts': True,
[36m(TaskRunner pid=2823680)[0m           'filter_overlong_prompts_workers': 1,
[36m(TaskRunner pid=2823680)[0m           'image_key': 'images',
[36m(TaskRunner pid=2823680)[0m           'image_patch_size': 14,
[36m(TaskRunner pid=2823680)[0m           'max_prompt_length': 2048,
[36m(TaskRunner pid=2823680)[0m           'max_response_length': 8192,
[36m(TaskRunner pid=2823680)[0m           'prompt_key': 'prompt',
[36m(TaskRunner pid=2823680)[0m           'return_full_prompt': False,
[36m(TaskRunner pid=2823680)[0m           'return_multi_modal_inputs': True,
[36m(TaskRunner pid=2823680)[0m           'return_raw_chat': True,
[36m(TaskRunner pid=2823680)[0m           'return_raw_input_ids': False,
[36m(TaskRunner pid=2823680)[0m           'reward_fn_key': 'data_source',
[36m(TaskRunner pid=2823680)[0m           'sampler': {'class_name': 'FrontierCurriculumSampler',
[36m(TaskRunner pid=2823680)[0m                       'class_path': 'pkg://verl.experimental.dataset.frontier_sampler'},
[36m(TaskRunner pid=2823680)[0m           'seed': None,
[36m(TaskRunner pid=2823680)[0m           'shuffle': True,
[36m(TaskRunner pid=2823680)[0m           'tokenizer': None,
[36m(TaskRunner pid=2823680)[0m           'tool_config_path': None,
[36m(TaskRunner pid=2823680)[0m           'train_batch_size': 128,
[36m(TaskRunner pid=2823680)[0m           'train_files': '/storage/workspace/server-5/rl/jeremy/efficiency/outputs/openr1_math_46k_8192_quarter_1_5b_roadqaq_cot/phase5/train_clustered_sorted_11448.parquet',
[36m(TaskRunner pid=2823680)[0m           'train_max_samples': -1,
[36m(TaskRunner pid=2823680)[0m           'truncation': 'error',
[36m(TaskRunner pid=2823680)[0m           'trust_remote_code': False,
[36m(TaskRunner pid=2823680)[0m           'use_shm': False,
[36m(TaskRunner pid=2823680)[0m           'val_batch_size': None,
[36m(TaskRunner pid=2823680)[0m           'val_files': ['/storage/workspace/server-5/rl/jeremy/efficiency/dataset/aime2024/test.parquet',
[36m(TaskRunner pid=2823680)[0m                         '/storage/workspace/server-5/rl/jeremy/efficiency/dataset/aime25/test.parquet',
[36m(TaskRunner pid=2823680)[0m                         '/storage/workspace/server-5/rl/jeremy/efficiency/dataset/math500/test.parquet'],
[36m(TaskRunner pid=2823680)[0m           'val_max_samples': -1,
[36m(TaskRunner pid=2823680)[0m           'validation_shuffle': False,
[36m(TaskRunner pid=2823680)[0m           'video_key': 'videos'},
[36m(TaskRunner pid=2823680)[0m  'global_profiler': {'_target_': 'verl.utils.profiler.ProfilerConfig',
[36m(TaskRunner pid=2823680)[0m                      'global_tool_config': {'nsys': {'_target_': 'verl.utils.profiler.config.NsightToolConfig',
[36m(TaskRunner pid=2823680)[0m                                                      'controller_nsight_options': {'cuda-graph-trace': 'graph',
[36m(TaskRunner pid=2823680)[0m                                                                                    'cuda-memory-usage': 'true',
[36m(TaskRunner pid=2823680)[0m                                                                                    'trace': 'cuda,nvtx,cublas,ucx'},
[36m(TaskRunner pid=2823680)[0m                                                      'discrete': False,
[36m(TaskRunner pid=2823680)[0m                                                      'worker_nsight_options': {'capture-range': 'cudaProfilerApi',
[36m(TaskRunner pid=2823680)[0m                                                                                'capture-range-end': None,
[36m(TaskRunner pid=2823680)[0m                                                                                'cuda-graph-trace': 'graph',
[36m(TaskRunner pid=2823680)[0m                                                                                'cuda-memory-usage': 'true',
[36m(TaskRunner pid=2823680)[0m                                                                                'kill': 'none',
[36m(TaskRunner pid=2823680)[0m                                                                                'trace': 'cuda,nvtx,cublas,ucx'}},
[36m(TaskRunner pid=2823680)[0m                                             'torch_memory': {'context': 'all',
[36m(TaskRunner pid=2823680)[0m                                                              'kw_args': {},
[36m(TaskRunner pid=2823680)[0m                                                              'stack_depth': 32,
[36m(TaskRunner pid=2823680)[0m                                                              'stacks': 'all',
[36m(TaskRunner pid=2823680)[0m                                                              'trace_alloc_max_entries': 100000}},
[36m(TaskRunner pid=2823680)[0m                      'profile_continuous_steps': False,
[36m(TaskRunner pid=2823680)[0m                      'save_path': 'outputs/profile',
[36m(TaskRunner pid=2823680)[0m                      'steps': None,
[36m(TaskRunner pid=2823680)[0m                      'tool': None},
[36m(TaskRunner pid=2823680)[0m  'ray_kwargs': {'ray_init': {'num_cpus': None}, 'timeline_json_file': None},
[36m(TaskRunner pid=2823680)[0m  'reward': {'custom_reward_function': {'name': 'compute_score',
[36m(TaskRunner pid=2823680)[0m                                        'path': '/storage/workspace/server-5/rl/jeremy/efficiency/verl/examples/grpo_trainer/reward_boxed_binary.py'},
[36m(TaskRunner pid=2823680)[0m             'num_workers': 8,
[36m(TaskRunner pid=2823680)[0m             'reward_manager': {'_target_': 'verl.workers.config.reward_model.RewardManagerConfig',
[36m(TaskRunner pid=2823680)[0m                                'module': {'_target_': 'verl.trainer.config.config.ModuleConfig',
[36m(TaskRunner pid=2823680)[0m                                           'name': 'custom_reward_manager',
[36m(TaskRunner pid=2823680)[0m                                           'path': None},
[36m(TaskRunner pid=2823680)[0m                                'name': 'naive',
[36m(TaskRunner pid=2823680)[0m                                'source': 'register'},
[36m(TaskRunner pid=2823680)[0m             'reward_model': {'enable': False,
[36m(TaskRunner pid=2823680)[0m                              'enable_resource_pool': False,
[36m(TaskRunner pid=2823680)[0m                              'model_path': None,
[36m(TaskRunner pid=2823680)[0m                              'n_gpus_per_node': 8,
[36m(TaskRunner pid=2823680)[0m                              'nnodes': 0,
[36m(TaskRunner pid=2823680)[0m                              'rollout': {'_target_': 'verl.workers.config.RolloutConfig',
[36m(TaskRunner pid=2823680)[0m                                          'cudagraph_capture_sizes': None,
[36m(TaskRunner pid=2823680)[0m                                          'data_parallel_size': 1,
[36m(TaskRunner pid=2823680)[0m                                          'disable_log_stats': True,
[36m(TaskRunner pid=2823680)[0m                                          'dtype': 'bfloat16',
[36m(TaskRunner pid=2823680)[0m                                          'enable_chunked_prefill': True,
[36m(TaskRunner pid=2823680)[0m                                          'enable_prefix_caching': True,
[36m(TaskRunner pid=2823680)[0m                                          'enforce_eager': True,
[36m(TaskRunner pid=2823680)[0m                                          'engine_kwargs': {},
[36m(TaskRunner pid=2823680)[0m                                          'expert_parallel_size': 1,
[36m(TaskRunner pid=2823680)[0m                                          'free_cache_engine': True,
[36m(TaskRunner pid=2823680)[0m                                          'gpu_memory_utilization': 0.5,
[36m(TaskRunner pid=2823680)[0m                                          'limit_images': None,
[36m(TaskRunner pid=2823680)[0m                                          'load_format': 'auto',
[36m(TaskRunner pid=2823680)[0m                                          'max_model_len': None,
[36m(TaskRunner pid=2823680)[0m                                          'max_num_batched_tokens': 8192,
[36m(TaskRunner pid=2823680)[0m                                          'max_num_seqs': 1024,
[36m(TaskRunner pid=2823680)[0m                                          'name': '???',
[36m(TaskRunner pid=2823680)[0m                                          'prompt_length': 2048,
[36m(TaskRunner pid=2823680)[0m                                          'response_length': 2048,
[36m(TaskRunner pid=2823680)[0m                                          'skip_tokenizer_init': False,
[36m(TaskRunner pid=2823680)[0m                                          'tensor_model_parallel_size': 2}},
[36m(TaskRunner pid=2823680)[0m             'sandbox_fusion': {'max_concurrent': 64,
[36m(TaskRunner pid=2823680)[0m                                'memory_limit_mb': 1024,
[36m(TaskRunner pid=2823680)[0m                                'url': None}},
[36m(TaskRunner pid=2823680)[0m  'trainer': {'balance_batch': True,
[36m(TaskRunner pid=2823680)[0m              'best_ckpt_metric_key': 'val-core/aime2025/acc/best@16/mean',
[36m(TaskRunner pid=2823680)[0m              'best_ckpt_mode': 'max',
[36m(TaskRunner pid=2823680)[0m              'critic_warmup': 0,
[36m(TaskRunner pid=2823680)[0m              'default_hdfs_dir': None,
[36m(TaskRunner pid=2823680)[0m              'default_local_dir': '/storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/checkpoints/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633',
[36m(TaskRunner pid=2823680)[0m              'del_local_ckpt_after_load': False,
[36m(TaskRunner pid=2823680)[0m              'device': 'cuda',
[36m(TaskRunner pid=2823680)[0m              'esi_redundant_time': 0,
[36m(TaskRunner pid=2823680)[0m              'experiment_name': 'efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633',
[36m(TaskRunner pid=2823680)[0m              'log_val_generations': 0,
[36m(TaskRunner pid=2823680)[0m              'logger': ['console', 'wandb'],
[36m(TaskRunner pid=2823680)[0m              'max_actor_ckpt_to_keep': None,
[36m(TaskRunner pid=2823680)[0m              'max_critic_ckpt_to_keep': None,
[36m(TaskRunner pid=2823680)[0m              'n_gpus_per_node': 4,
[36m(TaskRunner pid=2823680)[0m              'nnodes': 1,
[36m(TaskRunner pid=2823680)[0m              'project_name': 'efficiency',
[36m(TaskRunner pid=2823680)[0m              'ray_wait_register_center_timeout': 300,
[36m(TaskRunner pid=2823680)[0m              'resume_from_path': None,
[36m(TaskRunner pid=2823680)[0m              'resume_mode': 'disable',
[36m(TaskRunner pid=2823680)[0m              'rollout_data_dir': None,
[36m(TaskRunner pid=2823680)[0m              'save_best_val_checkpoint': True,
[36m(TaskRunner pid=2823680)[0m              'save_freq': 25,
[36m(TaskRunner pid=2823680)[0m              'test_freq': 25,
[36m(TaskRunner pid=2823680)[0m              'total_epochs': 10,
[36m(TaskRunner pid=2823680)[0m              'total_training_steps': 800,
[36m(TaskRunner pid=2823680)[0m              'use_legacy_worker_impl': 'auto',
[36m(TaskRunner pid=2823680)[0m              'val_before_train': True,
[36m(TaskRunner pid=2823680)[0m              'val_only': False,
[36m(TaskRunner pid=2823680)[0m              'validation_data_dir': None},
[36m(TaskRunner pid=2823680)[0m  'transfer_queue': {'enable': False}}
[36m(TaskRunner pid=2823680)[0m [validate_config] All configuration checks passed successfully!
[36m(TaskRunner pid=2823680)[0m Using dataset class: RLHFDataset
[36m(TaskRunner pid=2823680)[0m Generating train split: 0 examples [00:00, ? examples/s]
[36m(TaskRunner pid=2823680)[0m Generating train split: 11448 examples [00:00, 28935.33 examples/s]
[36m(TaskRunner pid=2823680)[0m Generating train split: 11448 examples [00:00, 12331.73 examples/s]
[36m(TaskRunner pid=2823680)[0m dataset len: 11448
[36m(TaskRunner pid=2823680)[0m Setting TOKENIZERS_PARALLELISM=false for forked processes.
[36m(TaskRunner pid=2823680)[0m WARNING:2026-04-12 11:27:21,177:Setting TOKENIZERS_PARALLELISM=false for forked processes.
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1):   0%|          | 0/11448 [00:00<?, ? examples/s]
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1):   9%|▊         | 1000/11448 [00:01<00:12, 844.51 examples/s]
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1):  17%|█▋        | 2000/11448 [00:01<00:07, 1182.23 examples/s]
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1):  26%|██▌       | 3000/11448 [00:02<00:06, 1289.21 examples/s]
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1):  35%|███▍      | 4000/11448 [00:03<00:05, 1356.85 examples/s]
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1):  44%|████▎     | 5000/11448 [00:03<00:04, 1418.72 examples/s]
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1):  52%|█████▏    | 6000/11448 [00:04<00:03, 1451.52 examples/s]
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1):  61%|██████    | 7000/11448 [00:05<00:02, 1487.25 examples/s]
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1):  70%|██████▉   | 8000/11448 [00:05<00:02, 1510.39 examples/s]
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1):  79%|███████▊  | 9000/11448 [00:06<00:01, 1538.79 examples/s]
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1):  87%|████████▋ | 10000/11448 [00:06<00:00, 1563.29 examples/s]
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1):  96%|█████████▌| 11000/11448 [00:07<00:00, 1614.40 examples/s]
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1): 100%|██████████| 11448/11448 [00:07<00:00, 1583.93 examples/s]
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1): 100%|██████████| 11448/11448 [00:08<00:00, 1430.38 examples/s]
[36m(TaskRunner pid=2823680)[0m filter dataset len: 11447
[36m(TaskRunner pid=2823680)[0m Using dataset class: RLHFDataset
[36m(TaskRunner pid=2823680)[0m dataset len: 560
[36m(TaskRunner pid=2823680)[0m Setting TOKENIZERS_PARALLELISM=false for forked processes.
[36m(TaskRunner pid=2823680)[0m WARNING:2026-04-12 11:27:29,615:Setting TOKENIZERS_PARALLELISM=false for forked processes.
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1):   0%|          | 0/560 [00:00<?, ? examples/s]
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1): 100%|██████████| 560/560 [00:00<00:00, 589.17 examples/s]
[36m(TaskRunner pid=2823680)[0m Filtering prompts longer than 2048 tokens (num_proc=1): 100%|██████████| 560/560 [00:01<00:00, 524.81 examples/s]
[36m(TaskRunner pid=2823680)[0m filter dataset len: 560
[36m(TaskRunner pid=2823680)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/trainer/ppo/ray_trainer.py:290: UserWarning: Disabled critic as algorithm.adv_estimator != gae. If it is not intended, please set critic.enable=True
[36m(TaskRunner pid=2823680)[0m   self.use_critic = need_critic(self.config)
[36m(TaskRunner pid=2823680)[0m Size of train dataloader: 89, Size of val dataloader: 1
[36m(TaskRunner pid=2823680)[0m Total training steps: 800
[36m(TaskRunner pid=2823680)[0m colocated worker base class <class 'verl.single_controller.base.worker.Worker'>
[36m(WorkerDict pid=2825158)[0m [Gloo] Rank 1 is connected to 3 peer ranks. Expected number of connected peer ranks is : 3
[36m(WorkerDict pid=2825157)[0m reference model: RoadQAQ/Qwen2.5-Math-1.5B-16k-think
[36m(WorkerDict pid=2825157)[0m Model config after override: Qwen2Config {
[36m(WorkerDict pid=2825157)[0m   "architectures": [
[36m(WorkerDict pid=2825157)[0m     "Qwen2ForCausalLM"
[36m(WorkerDict pid=2825157)[0m   ],
[36m(WorkerDict pid=2825157)[0m   "attention_dropout": 0.0,
[36m(WorkerDict pid=2825157)[0m   "attn_implementation": "flash_attention_2",
[36m(WorkerDict pid=2825157)[0m   "dtype": "bfloat16",
[36m(WorkerDict pid=2825157)[0m   "eos_token_id": 151643,
[36m(WorkerDict pid=2825157)[0m   "hidden_act": "silu",
[36m(WorkerDict pid=2825157)[0m   "hidden_size": 1536,
[36m(WorkerDict pid=2825157)[0m   "initializer_range": 0.02,
[36m(WorkerDict pid=2825157)[0m   "intermediate_size": 8960,
[36m(WorkerDict pid=2825157)[0m   "layer_types": [
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention"
[36m(WorkerDict pid=2825157)[0m   ],
[36m(WorkerDict pid=2825157)[0m   "max_position_embeddings": 16384,
[36m(WorkerDict pid=2825157)[0m   "max_window_layers": 21,
[36m(WorkerDict pid=2825157)[0m   "model_type": "qwen2",
[36m(WorkerDict pid=2825157)[0m   "num_attention_heads": 12,
[36m(WorkerDict pid=2825157)[0m   "num_hidden_layers": 28,
[36m(WorkerDict pid=2825157)[0m   "num_key_value_heads": 2,
[36m(WorkerDict pid=2825157)[0m   "pad_token_id": 151643,
[36m(WorkerDict pid=2825157)[0m   "rms_norm_eps": 1e-06,
[36m(WorkerDict pid=2825157)[0m   "rope_scaling": null,
[36m(WorkerDict pid=2825157)[0m   "rope_theta": 40000,
[36m(WorkerDict pid=2825157)[0m   "sliding_window": null,
[36m(WorkerDict pid=2825157)[0m   "tie_word_embeddings": true,
[36m(WorkerDict pid=2825157)[0m   "transformers_version": "4.57.6",
[36m(WorkerDict pid=2825157)[0m   "use_cache": true,
[36m(WorkerDict pid=2825157)[0m   "use_mrope": false,
[36m(WorkerDict pid=2825157)[0m   "use_sliding_window": false,
[36m(WorkerDict pid=2825157)[0m   "vocab_size": 151936
[36m(WorkerDict pid=2825157)[0m }
[36m(WorkerDict pid=2825157)[0m 
[36m(WorkerDict pid=2825160)[0m `torch_dtype` is deprecated! Use `dtype` instead!
[36m(WorkerDict pid=2825160)[0m Flash Attention 2 only supports torch.float16 and torch.bfloat16 dtypes, but the current dype in Qwen2ForCausalLM is torch.float32. You should run training or inference using Automatic Mixed-Precision via the `with torch.autocast(device_type='torch_device'):` decorator, or load the model with the `dtype` argument. Example: `model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="flash_attention_2", dtype=torch.float16)`
[36m(WorkerDict pid=2825158)[0m Monkey patch _flash_attention_forward in transformers.integrations.flash_attention
[36m(WorkerDict pid=2825158)[0m Skipping monkey patch for Qwen2ForCausalLM as use_fused_kernels is False or fused_kernels_backend is torch
[36m(WorkerDict pid=2825159)[0m [Gloo] Rank 2 is connected to 3 peer ranks. Expected number of connected peer ranks is : 3[32m [repeated 3x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/user-guides/configure-logging.html#log-deduplication for more options.)[0m
[36m(WorkerDict pid=2825157)[0m Qwen2ForCausalLM contains 1.54B parameters
[36m(WorkerDict pid=2825157)[0m wrap_policy: functools.partial(<function _or_policy at 0x7f8f4cce3010>, policies=[functools.partial(<function transformer_auto_wrap_policy at 0x7f8f4cce2ef0>, transformer_layer_cls={<class 'transformers.models.qwen2.modeling_qwen2.Qwen2DecoderLayer'>})])
[36m(WorkerDict pid=2825157)[0m NCCL version 2.27.5+cuda12.9
[36m(WorkerDict pid=2825157)[0m Ref use_remove_padding=True
[36m(WorkerDict pid=2825157)[0m Ref use_fused_kernels=False
[36m(WorkerDict pid=2825157)[0m Ref use_prefix_grouper=False
[36m(WorkerDict pid=2825159)[0m Monkey patch _flash_attention_forward in transformers.integrations.flash_attention[32m [repeated 3x across cluster][0m
[36m(WorkerDict pid=2825159)[0m Skipping monkey patch for Qwen2ForCausalLM as use_fused_kernels is False or fused_kernels_backend is torch[32m [repeated 3x across cluster][0m
[36m(WorkerDict pid=2825157)[0m Model config after override: Qwen2Config {
[36m(WorkerDict pid=2825157)[0m   "architectures": [
[36m(WorkerDict pid=2825157)[0m     "Qwen2ForCausalLM"
[36m(WorkerDict pid=2825157)[0m   ],
[36m(WorkerDict pid=2825157)[0m   "attention_dropout": 0.0,
[36m(WorkerDict pid=2825157)[0m   "attn_implementation": "flash_attention_2",
[36m(WorkerDict pid=2825157)[0m   "dtype": "bfloat16",
[36m(WorkerDict pid=2825157)[0m   "eos_token_id": 151643,
[36m(WorkerDict pid=2825157)[0m   "hidden_act": "silu",
[36m(WorkerDict pid=2825157)[0m   "hidden_size": 1536,
[36m(WorkerDict pid=2825157)[0m   "initializer_range": 0.02,
[36m(WorkerDict pid=2825157)[0m   "intermediate_size": 8960,
[36m(WorkerDict pid=2825157)[0m   "layer_types": [
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention",
[36m(WorkerDict pid=2825157)[0m     "full_attention"
[36m(WorkerDict pid=2825157)[0m   ],
[36m(WorkerDict pid=2825157)[0m   "max_position_embeddings": 16384,
[36m(WorkerDict pid=2825157)[0m   "max_window_layers": 21,
[36m(WorkerDict pid=2825157)[0m   "model_type": "qwen2",
[36m(WorkerDict pid=2825157)[0m   "num_attention_heads": 12,
[36m(WorkerDict pid=2825157)[0m   "num_hidden_layers": 28,
[36m(WorkerDict pid=2825157)[0m   "num_key_value_heads": 2,
[36m(WorkerDict pid=2825157)[0m   "pad_token_id": 151643,
[36m(WorkerDict pid=2825157)[0m   "rms_norm_eps": 1e-06,
[36m(WorkerDict pid=2825157)[0m   "rope_scaling": null,
[36m(WorkerDict pid=2825157)[0m   "rope_theta": 40000,
[36m(WorkerDict pid=2825157)[0m   "sliding_window": null,
[36m(WorkerDict pid=2825157)[0m   "tie_word_embeddings": true,
[36m(WorkerDict pid=2825157)[0m   "transformers_version": "4.57.6",
[36m(WorkerDict pid=2825157)[0m   "use_cache": true,
[36m(WorkerDict pid=2825157)[0m   "use_mrope": false,
[36m(WorkerDict pid=2825157)[0m   "use_sliding_window": false,
[36m(WorkerDict pid=2825157)[0m   "vocab_size": 151936
[36m(WorkerDict pid=2825157)[0m }
[36m(WorkerDict pid=2825157)[0m 
[36m(WorkerDict pid=2825158)[0m Monkey patch _flash_attention_forward in transformers.integrations.flash_attention
[36m(WorkerDict pid=2825158)[0m Skipping monkey patch for Qwen2ForCausalLM as use_fused_kernels is False or fused_kernels_backend is torch
[36m(WorkerDict pid=2825160)[0m Monkey patch _flash_attention_forward in transformers.integrations.flash_attention
[36m(WorkerDict pid=2825160)[0m Skipping monkey patch for Qwen2ForCausalLM as use_fused_kernels is False or fused_kernels_backend is torch
[36m(WorkerDict pid=2825157)[0m Qwen2ForCausalLM contains 1.54B parameters
[36m(WorkerDict pid=2825157)[0m wrap_policy: functools.partial(<function _or_policy at 0x7f8f4cce3010>, policies=[functools.partial(<function transformer_auto_wrap_policy at 0x7f8f4cce2ef0>, transformer_layer_cls={<class 'transformers.models.qwen2.modeling_qwen2.Qwen2DecoderLayer'>})])
[36m(WorkerDict pid=2825157)[0m Total steps: 800, num_warmup_steps: 0
[36m(WorkerDict pid=2825157)[0m Actor use_remove_padding=True
[36m(WorkerDict pid=2825157)[0m Actor use_fused_kernels=False
[36m(WorkerDict pid=2825157)[0m Actor use_prefix_grouper=False
[36m(WorkerDict pid=2825160)[0m [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[36m(WorkerDict pid=2825160)[0m [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[36m(WorkerDict pid=2825160)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/workers/fsdp_workers.py:727: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
[36m(WorkerDict pid=2825160)[0m   FSDP.set_state_dict_type(
[36m(WorkerDict pid=2825159)[0m `torch_dtype` is deprecated! Use `dtype` instead![32m [repeated 3x across cluster][0m
[36m(WorkerDict pid=2825159)[0m Flash Attention 2 only supports torch.float16 and torch.bfloat16 dtypes, but the current dype in Qwen2Model is torch.float32. You should run training or inference using Automatic Mixed-Precision via the `with torch.autocast(device_type='torch_device'):` decorator, or load the model with the `dtype` argument. Example: `model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="flash_attention_2", dtype=torch.float16)`[32m [repeated 7x across cluster][0m
[36m(TaskRunner pid=2823680)[0m W0412 11:28:11.798000 2823680 site-packages/torch/utils/cpp_extension.py:118] No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
[36m(vLLMHttpServer pid=2827322)[0m ['serve',
[36m(vLLMHttpServer pid=2827322)[0m  'RoadQAQ/Qwen2.5-Math-1.5B-16k-think',
[36m(vLLMHttpServer pid=2827322)[0m  '--dtype',
[36m(vLLMHttpServer pid=2827322)[0m  'bfloat16',
[36m(vLLMHttpServer pid=2827322)[0m  '--load_format',
[36m(vLLMHttpServer pid=2827322)[0m  'dummy',
[36m(vLLMHttpServer pid=2827322)[0m  '--distributed_executor_backend',
[36m(vLLMHttpServer pid=2827322)[0m  'uni',
[36m(vLLMHttpServer pid=2827322)[0m  '--worker_extension_cls',
[36m(vLLMHttpServer pid=2827322)[0m  'verl.workers.rollout.vllm_rollout.utils.vLLMColocateWorkerExtension',
[36m(vLLMHttpServer pid=2827322)[0m  '--max_model_len',
[36m(vLLMHttpServer pid=2827322)[0m  '16384',
[36m(vLLMHttpServer pid=2827322)[0m  '--max_num_seqs',
[36m(vLLMHttpServer pid=2827322)[0m  '1024',
[36m(vLLMHttpServer pid=2827322)[0m  '--enable_chunked_prefill',
[36m(vLLMHttpServer pid=2827322)[0m  '--max_num_batched_tokens',
[36m(vLLMHttpServer pid=2827322)[0m  '8192',
[36m(vLLMHttpServer pid=2827322)[0m  '--enable_prefix_caching',
[36m(vLLMHttpServer pid=2827322)[0m  '--enable_sleep_mode',
[36m(vLLMHttpServer pid=2827322)[0m  '--logprobs_mode',
[36m(vLLMHttpServer pid=2827322)[0m  'processed_logprobs',
[36m(vLLMHttpServer pid=2827322)[0m  '--gpu_memory_utilization',
[36m(vLLMHttpServer pid=2827322)[0m  '0.6',
[36m(vLLMHttpServer pid=2827322)[0m  '--disable_log_stats',
[36m(vLLMHttpServer pid=2827322)[0m  '--tensor_parallel_size',
[36m(vLLMHttpServer pid=2827322)[0m  '1',
[36m(vLLMHttpServer pid=2827322)[0m  '--seed',
[36m(vLLMHttpServer pid=2827322)[0m  '0',
[36m(vLLMHttpServer pid=2827322)[0m  '--override_generation_config',
[36m(vLLMHttpServer pid=2827322)[0m  '{"temperature": 1.0, "top_k": -1, "top_p": 1, "repetition_penalty": 1.0, '
[36m(vLLMHttpServer pid=2827322)[0m  '"max_new_tokens": 8192}',
[36m(vLLMHttpServer pid=2827322)[0m  '--hf_overrides',
[36m(vLLMHttpServer pid=2827322)[0m  '{}',
[36m(vLLMHttpServer pid=2827322)[0m  '--scheduling_policy',
[36m(vLLMHttpServer pid=2827322)[0m  'fcfs',
[36m(vLLMHttpServer pid=2827322)[0m  '--compilation_config',
[36m(vLLMHttpServer pid=2827322)[0m  '{"cudagraph_mode": "FULL_AND_PIECEWISE"}']
[36m(WorkerDict pid=2825159)[0m Monkey patch _flash_attention_forward in transformers.integrations.flash_attention[32m [repeated 2x across cluster][0m
[36m(WorkerDict pid=2825159)[0m Skipping monkey patch for Qwen2ForCausalLM as use_fused_kernels is False or fused_kernels_backend is torch[32m [repeated 2x across cluster][0m
[36m(WorkerDict pid=2825157)[0m [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0[32m [repeated 6x across cluster][0m
[36m(vLLMHttpServer pid=2827322)[0m WARNING 04-12 11:28:34 [system_utils.py:152] We must use the `spawn` multiprocessing start method. Overriding VLLM_WORKER_MULTIPROC_METHOD to 'spawn'. See https://docs.vllm.ai/en/latest/usage/troubleshooting.html#python-multiprocessing for more information. Reasons: In a Ray actor and can only be spawned; CUDA is initialized
[36m(vLLMHttpServer pid=2827320)[0m (EngineCore_DP0 pid=2828264) <frozen importlib._bootstrap_external>:1184: FutureWarning: The cuda.cudart module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.runtime module instead.
[36m(vLLMHttpServer pid=2827320)[0m (EngineCore_DP0 pid=2828264) <frozen importlib._bootstrap_external>:1184: FutureWarning: The cuda.nvrtc module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.nvrtc module instead.
[36m(WorkerDict pid=2825159)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/workers/fsdp_workers.py:727: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .[32m [repeated 3x across cluster][0m
[36m(WorkerDict pid=2825159)[0m   FSDP.set_state_dict_type([32m [repeated 3x across cluster][0m
[36m(vLLMHttpServer pid=2827321)[0m (EngineCore_DP0 pid=2828243) Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):   0%|          | 0/51 [00:00<?, ?it/s]
[36m(vLLMHttpServer pid=2827322)[0m (EngineCore_DP0 pid=2828236) <frozen importlib._bootstrap_external>:1184: FutureWarning: The cuda.cudart module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.runtime module instead.[32m [repeated 3x across cluster][0m
[36m(vLLMHttpServer pid=2827322)[0m (EngineCore_DP0 pid=2828236) <frozen importlib._bootstrap_external>:1184: FutureWarning: The cuda.nvrtc module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.nvrtc module instead.[32m [repeated 3x across cluster][0m
[36m(vLLMHttpServer pid=2827321)[0m Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):   2%|▏         | 1/51 [00:00<00:06,  7.63it/s]
[36m(vLLMHttpServer pid=2827321)[0m Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):   4%|▍         | 2/51 [00:00<00:11,  4.17it/s]
[36m(vLLMHttpServer pid=2827321)[0m Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):   6%|▌         | 3/51 [00:00<00:14,  3.36it/s]
[36m(vLLMHttpServer pid=2827321)[0m Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  94%|█████████▍| 48/51 [00:04<00:00,  6.58it/s]
[36m(vLLMHttpServer pid=2827321)[0m Capturing CUDA graphs (mixed prefill-decode, PIECEWISE): 100%|██████████| 51/51 [00:04<00:00,  9.00it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE): 100%|██████████| 51/51 [00:04<00:00, 11.71it/s]
[36m(vLLMHttpServer pid=2827321)[0m (EngineCore_DP0 pid=2828243) Capturing CUDA graphs (decode, FULL):   0%|          | 0/51 [00:00<?, ?it/s]
[36m(vLLMHttpServer pid=2827321)[0m Capturing CUDA graphs (decode, FULL):   6%|▌         | 3/51 [00:00<00:01, 27.14it/s]
[36m(vLLMHttpServer pid=2827322)[0m (EngineCore_DP0 pid=2828236) Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):   0%|          | 0/51 [00:00<?, ?it/s][32m [repeated 3x across cluster][0m
[36m(vLLMHttpServer pid=2827322)[0m Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  88%|████████▊ | 45/51 [00:03<00:00, 11.06it/s][32m [repeated 87x across cluster][0m
[36m(vLLMHttpServer pid=2827321)[0m Capturing CUDA graphs (decode, FULL):  94%|█████████▍| 48/51 [00:02<00:00,  8.12it/s]
[36m(vLLMHttpServer pid=2827319)[0m Capturing CUDA graphs (decode, FULL): 100%|██████████| 51/51 [00:02<00:00, 12.62it/s]Capturing CUDA graphs (decode, FULL): 100%|██████████| 51/51 [00:02<00:00, 17.17it/s]
[36m(vLLMHttpServer pid=2827321)[0m WARNING 04-12 11:29:15 [api_router.py:30] LoRA dynamic loading & unloading is enabled in the API server. This should ONLY be used for local development!
[36m(vLLMHttpServer pid=2827319)[0m WARNING 04-12 11:28:35 [system_utils.py:152] We must use the `spawn` multiprocessing start method. Overriding VLLM_WORKER_MULTIPROC_METHOD to 'spawn'. See https://docs.vllm.ai/en/latest/usage/troubleshooting.html#python-multiprocessing for more information. Reasons: In a Ray actor and can only be spawned; CUDA is initialized[32m [repeated 3x across cluster][0m
[36m(vLLMHttpServer pid=2827319)[0m WARNING 04-12 11:29:15 [model.py:1355] Default vLLM sampling parameters have been overridden by the model's `generation_config.json`: `{'repetition_penalty': 1.0, 'temperature': 1.0, 'top_k': -1, 'top_p': 1, 'max_tokens': 8192}`. If this is not intended, please relaunch vLLM instance with `--generation-config vllm`.
[36m(TaskRunner pid=2823680)[0m AgentLoopManager: ['10.108.5.62:46729', '10.108.5.62:44067', '10.108.5.62:42311', '10.108.5.62:41221']
[36m(TaskRunner pid=2823680)[0m wandb: [wandb.login()] Loaded credentials for https://api.wandb.ai from WANDB_API_KEY.
[36m(vLLMHttpServer pid=2827322)[0m Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  92%|█████████▏| 47/51 [00:03<00:00, 12.57it/s][32m [repeated 5x across cluster][0m
[36m(vLLMHttpServer pid=2827322)[0m Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  98%|█████████▊| 50/51 [00:03<00:00, 15.32it/s]Capturing CUDA graphs (mixed prefill-decode, PIECEWISE): 100%|██████████| 51/51 [00:03<00:00, 14.25it/s][32m [repeated 2x across cluster][0m
[36m(vLLMHttpServer pid=2827322)[0m (EngineCore_DP0 pid=2828236) Capturing CUDA graphs (decode, FULL):   0%|          | 0/51 [00:00<?, ?it/s][32m [repeated 3x across cluster][0m
[36m(vLLMHttpServer pid=2827322)[0m Capturing CUDA graphs (decode, FULL):  90%|█████████ | 46/51 [00:02<00:00, 17.63it/s][32m [repeated 66x across cluster][0m
[36m(vLLMHttpServer pid=2827322)[0m Capturing CUDA graphs (decode, FULL): 100%|██████████| 51/51 [00:02<00:00, 17.43it/s][32m [repeated 6x across cluster][0m
[36m(TaskRunner pid=2823680)[0m wandb: Currently logged in as: dgbtlardd (dgbtlardd-johns-hopkins-university) to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
[36m(TaskRunner pid=2823680)[0m wandb: setting up run xl3x2s5j
[36m(TaskRunner pid=2823680)[0m wandb: Tracking run with wandb version 0.25.1
[36m(TaskRunner pid=2823680)[0m wandb: Run data is saved locally in /storage/workspace/server-5/rl/jeremy/efficiency/verl/wandb/run-20260412_112920-xl3x2s5j
[36m(TaskRunner pid=2823680)[0m wandb: Run `wandb offline` to turn off syncing.
[36m(TaskRunner pid=2823680)[0m wandb: Syncing run efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633
[36m(TaskRunner pid=2823680)[0m wandb: ⭐️ View project at https://wandb.ai/dgbtlardd-johns-hopkins-university/efficiency
[36m(TaskRunner pid=2823680)[0m wandb: 🚀 View run at https://wandb.ai/dgbtlardd-johns-hopkins-university/efficiency/runs/xl3x2s5j
[36m(TaskRunner pid=2823680)[0m wandb: Detected [openai] in use.
[36m(TaskRunner pid=2823680)[0m wandb: Use W&B Weave for improved LLM call tracing. Install Weave with `pip install weave` then add `import weave` to the top of your script.
[36m(TaskRunner pid=2823680)[0m wandb: For more information, check out the docs at: https://weave-docs.wandb.ai/
[36m(WorkerDict pid=2825160)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.
[36m(WorkerDict pid=2825160)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")
[36m(TaskRunner pid=2823680)[0m test_gen_batch meta info: {'eos_token_id': 151643, 'pad_token_id': 151643, 'recompute_log_prob': False, 'do_sample': True, 'validate': True, 'global_steps': 0}
[36m(vLLMHttpServer pid=2827322)[0m WARNING 04-12 11:29:15 [api_router.py:30] LoRA dynamic loading & unloading is enabled in the API server. This should ONLY be used for local development![32m [repeated 3x across cluster][0m
[36m(vLLMHttpServer pid=2827322)[0m WARNING 04-12 11:29:16 [model.py:1355] Default vLLM sampling parameters have been overridden by the model's `generation_config.json`: `{'repetition_penalty': 1.0, 'temperature': 1.0, 'top_k': -1, 'top_p': 1, 'max_tokens': 8192}`. If this is not intended, please relaunch vLLM instance with `--generation-config vllm`.[32m [repeated 3x across cluster][0m
[36m(AgentLoopWorker pid=2830803)[0m W0412 11:29:31.269000 2830803 site-packages/torch/utils/cpp_extension.py:118] No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
[36m(WorkerDict pid=2825159)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.[32m [repeated 3x across cluster][0m
[36m(WorkerDict pid=2825159)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")[32m [repeated 3x across cluster][0m
[36m(AgentLoopWorker pid=2830804)[0m Using dataset class: RLHFDataset
[36m(vLLMHttpServer pid=2827322)[0m WARNING 04-12 11:29:38 [input_processor.py:254] Passing raw prompts to InputProcessor is deprecated and will be removed in v0.18. You should instead pass the outputs of Renderer.render_cmpl() or Renderer.render_chat().
[36m(AgentLoopWorker pid=2830812)[0m You're using a Qwen2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[36m(AgentLoopWorker pid=2830810)[0m W0412 11:29:31.268000 2830810 site-packages/torch/utils/cpp_extension.py:118] No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'[32m [repeated 7x across cluster][0m
[36m(RewardLoopWorker pid=2826756)[0m WARNING:2026-04-12 11:30:03,954:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(AgentLoopWorker pid=2830803)[0m You're using a Qwen2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.[32m [repeated 7x across cluster][0m
[36m(TaskRunner pid=2823680)[0m validation generation end
[36m(AgentLoopWorker pid=2830810)[0m Using dataset class: RLHFDataset[32m [repeated 7x across cluster][0m
[36m(vLLMHttpServer pid=2827321)[0m WARNING 04-12 11:29:38 [input_processor.py:254] Passing raw prompts to InputProcessor is deprecated and will be removed in v0.18. You should instead pass the outputs of Renderer.render_cmpl() or Renderer.render_chat().[32m [repeated 3x across cluster][0m
[36m(TaskRunner pid=2823680)[0m ("Initial validation metrics: {'val-aux/aime2024/reward/mean@16': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.027083333333333334), 'val-aux/aime2024/reward/std@16': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.07585646105399668), 'val-aux/aime2024/reward/best@2/mean': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.05056666666666667), 'val-aux/aime2024/reward/best@2/std': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.09832020110727895), 'val-aux/aime2024/reward/worst@2/mean': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.004266666666666667), 'val-aux/aime2024/reward/worst@2/std': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.026802003333333345), 'val-aux/aime2024/reward/maj@2/mean': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.026600000000000002), 'val-aux/aime2024/reward/maj@2/std': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.07612171202369464), 'val-aux/aime2024/reward/best@4/mean': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.0888), 'val-aux/aime2024/reward/best@4/std': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.11694868915101046), 'val-aux/aime2024/reward/worst@4/mean': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.0002666666666666667), 'val-aux/aime2024/reward/worst@4/std': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.0038326539548979013), 'val-aux/aime2024/reward/maj@4/mean': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.02583333333333333), 'val-aux/aime2024/reward/maj@4/std': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.06797213319569753), 'val-aux/aime2024/reward/best@8/mean': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.13879999999999998), 'val-aux/aime2024/reward/best@8/std': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.12174840560950617), 'val-aux/aime2024/reward/worst@8/mean': "
[36m(TaskRunner pid=2823680)[0m  "np.float64(0.0), 'val-aux/aime2024/reward/worst@8/std': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/reward/maj@8/mean': np.float64(0.027533333333333333), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/reward/maj@8/std': np.float64(0.05736516116790047), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/reward/best@16/mean': np.float64(0.19306666666666666), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/reward/best@16/std': np.float64(0.10641941716877429), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/reward/worst@16/mean': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/reward/worst@16/std': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/reward/maj@16/mean': np.float64(0.0323), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/reward/maj@16/std': np.float64(0.050519292373600734), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/mean@16': np.float64(0.027083333333333334), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/std@16': np.float64(0.07585646105399668), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/best@2/mean': np.float64(0.05056666666666667), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/best@2/std': np.float64(0.09832020110727895), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/worst@2/mean': np.float64(0.004266666666666667), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/worst@2/std': np.float64(0.026802003333333345), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/maj@2/mean': np.float64(0.026600000000000002), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/maj@2/std': np.float64(0.07612171202369464), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/best@4/mean': np.float64(0.0888), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/best@4/std': np.float64(0.11694868915101046), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/worst@4/mean': np.float64(0.0002666666666666667), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/worst@4/std': np.float64(0.0038326539548979013), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/maj@4/mean': np.float64(0.02583333333333333), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/maj@4/std': np.float64(0.06797213319569753), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/best@8/mean': np.float64(0.13879999999999998), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/best@8/std': np.float64(0.12174840560950617), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/worst@8/mean': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/worst@8/std': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/maj@8/mean': np.float64(0.027533333333333333), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/maj@8/std': np.float64(0.05736516116790047), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/best@16/mean': np.float64(0.19306666666666666), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/best@16/std': np.float64(0.10641941716877429), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/worst@16/mean': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/worst@16/std': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/maj@16/mean': np.float64(0.0323), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/score/maj@16/std': np.float64(0.050519292373600734), "
[36m(TaskRunner pid=2823680)[0m  "'val-core/aime2024/acc/mean@16': np.float64(0.027083333333333334), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/std@16': np.float64(0.07585646105399668), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/best@2/mean': np.float64(0.05056666666666667), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/best@2/std': np.float64(0.09832020110727895), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/worst@2/mean': np.float64(0.004266666666666667), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/worst@2/std': np.float64(0.026802003333333345), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/maj@2/mean': np.float64(0.026600000000000002), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/maj@2/std': np.float64(0.07612171202369464), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/best@4/mean': np.float64(0.0888), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/best@4/std': np.float64(0.11694868915101046), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/worst@4/mean': np.float64(0.0002666666666666667), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/worst@4/std': np.float64(0.0038326539548979013), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/maj@4/mean': np.float64(0.02583333333333333), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/maj@4/std': np.float64(0.06797213319569753), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/best@8/mean': np.float64(0.13879999999999998), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/best@8/std': np.float64(0.12174840560950617), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/worst@8/mean': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/worst@8/std': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/maj@8/mean': np.float64(0.027533333333333333), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/maj@8/std': np.float64(0.05736516116790047), "
[36m(TaskRunner pid=2823680)[0m  "'val-core/aime2024/acc/best@16/mean': np.float64(0.19306666666666666), "
[36m(TaskRunner pid=2823680)[0m  "'val-core/aime2024/acc/best@16/std': np.float64(0.10641941716877429), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/worst@16/mean': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/acc/worst@16/std': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-core/aime2024/acc/maj@16/mean': np.float64(0.0323), "
[36m(TaskRunner pid=2823680)[0m  "'val-core/aime2024/acc/maj@16/std': np.float64(0.050519292373600734), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/mean@16': np.float64(0.016666666666666666), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/std@16': np.float64(0.04159515113504068), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/best@2/mean': np.float64(0.030933333333333334), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/best@2/std': np.float64(0.05278365522501431), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/worst@2/mean': np.float64(0.0028), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/worst@2/std': np.float64(0.016608150639980467), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/maj@2/mean': np.float64(0.017433333333333335), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/maj@2/std': np.float64(0.04237131917479789), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/best@4/mean': np.float64(0.05346666666666667), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/best@4/std': np.float64(0.06034530524025951), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/worst@4/mean': np.float64(0.0001), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/worst@4/std': np.float64(0.0018230011885167093), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/maj@4/mean': np.float64(0.018033333333333335), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/maj@4/std': np.float64(0.04150854779233797), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/best@8/mean': np.float64(0.0799), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/best@8/std': np.float64(0.05830093217234193), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/worst@8/mean': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/worst@8/std': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/maj@8/mean': np.float64(0.0209), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/maj@8/std': np.float64(0.040250869777013205), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/best@16/mean': np.float64(0.10443333333333334), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/best@16/std': np.float64(0.04553944033439126), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/worst@16/mean': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/worst@16/std': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/maj@16/mean': np.float64(0.0239), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/reward/maj@16/std': np.float64(0.03468903903525698), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/mean@16': np.float64(0.016666666666666666), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/std@16': np.float64(0.04159515113504068), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/best@2/mean': np.float64(0.030933333333333334), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/best@2/std': np.float64(0.05278365522501431), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/worst@2/mean': np.float64(0.0028), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/worst@2/std': np.float64(0.016608150639980467), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/maj@2/mean': np.float64(0.017433333333333335), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/maj@2/std': np.float64(0.04237131917479789), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/best@4/mean': np.float64(0.05346666666666667), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/best@4/std': np.float64(0.06034530524025951), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/worst@4/mean': np.float64(0.0001), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/worst@4/std': np.float64(0.0018230011885167093), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/maj@4/mean': np.float64(0.018033333333333335), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/maj@4/std': np.float64(0.04150854779233797), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/best@8/mean': np.float64(0.0799), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/best@8/std': np.float64(0.05830093217234193), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/worst@8/mean': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/worst@8/std': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/maj@8/mean': np.float64(0.0209), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/maj@8/std': np.float64(0.040250869777013205), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/best@16/mean': np.float64(0.10443333333333334), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/best@16/std': np.float64(0.04553944033439126), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/worst@16/mean': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/worst@16/std': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/maj@16/mean': np.float64(0.0239), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/score/maj@16/std': np.float64(0.03468903903525698), "
[36m(TaskRunner pid=2823680)[0m  "'val-core/aime2025/acc/mean@16': np.float64(0.016666666666666666), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/std@16': np.float64(0.04159515113504068), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/best@2/mean': np.float64(0.030933333333333334), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/best@2/std': np.float64(0.05278365522501431), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/worst@2/mean': np.float64(0.0028), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/worst@2/std': np.float64(0.016608150639980467), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/maj@2/mean': np.float64(0.017433333333333335), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/maj@2/std': np.float64(0.04237131917479789), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/best@4/mean': np.float64(0.05346666666666667), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/best@4/std': np.float64(0.06034530524025951), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/worst@4/mean': np.float64(0.0001), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/worst@4/std': np.float64(0.0018230011885167093), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/maj@4/mean': np.float64(0.018033333333333335), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/maj@4/std': np.float64(0.04150854779233797), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/best@8/mean': np.float64(0.0799), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/best@8/std': np.float64(0.05830093217234193), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/worst@8/mean': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/worst@8/std': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/maj@8/mean': np.float64(0.0209), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/maj@8/std': np.float64(0.040250869777013205), "
[36m(TaskRunner pid=2823680)[0m  "'val-core/aime2025/acc/best@16/mean': np.float64(0.10443333333333334), "
[36m(TaskRunner pid=2823680)[0m  "'val-core/aime2025/acc/best@16/std': np.float64(0.04553944033439126), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/worst@16/mean': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/acc/worst@16/std': np.float64(0.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-core/aime2025/acc/maj@16/mean': np.float64(0.0239), "
[36m(TaskRunner pid=2823680)[0m  "'val-core/aime2025/acc/maj@16/std': np.float64(0.03468903903525698), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/reward/mean@4': np.float64(0.336), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/reward/std@4': np.float64(0.25700892833418104), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/reward/best@2/mean': np.float64(0.452158), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/reward/best@2/std': np.float64(0.2367017636032033), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/reward/worst@2/mean': np.float64(0.220298), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/reward/worst@2/std': np.float64(0.20553663270030975), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/reward/maj@2/mean': np.float64(0.33651), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/reward/maj@2/std': np.float64(0.25708153745295514), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/reward/best@4/mean': np.float64(0.554798), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/reward/best@4/std': np.float64(0.1715609027698682), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/reward/worst@4/mean': np.float64(0.1397), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/reward/worst@4/std': np.float64(0.1205417649724046), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/reward/maj@4/mean': np.float64(0.342432), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/reward/maj@4/std': np.float64(0.2370124781902772), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/score/mean@4': np.float64(0.336), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/score/std@4': np.float64(0.25700892833418104), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/score/best@2/mean': np.float64(0.452158), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/score/best@2/std': np.float64(0.2367017636032033), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/score/worst@2/mean': np.float64(0.220298), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/score/worst@2/std': np.float64(0.20553663270030975), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/score/maj@2/mean': np.float64(0.33651), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/score/maj@2/std': np.float64(0.25708153745295514), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/score/best@4/mean': np.float64(0.554798), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/score/best@4/std': np.float64(0.1715609027698682), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/score/worst@4/mean': np.float64(0.1397), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/score/worst@4/std': np.float64(0.1205417649724046), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/score/maj@4/mean': np.float64(0.342432), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/score/maj@4/std': np.float64(0.2370124781902772), "
[36m(TaskRunner pid=2823680)[0m  "'val-core/math500/acc/mean@4': np.float64(0.336), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/acc/std@4': np.float64(0.25700892833418104), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/acc/best@2/mean': np.float64(0.452158), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/acc/best@2/std': np.float64(0.2367017636032033), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/acc/worst@2/mean': np.float64(0.220298), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/acc/worst@2/std': np.float64(0.20553663270030975), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/acc/maj@2/mean': np.float64(0.33651), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/acc/maj@2/std': np.float64(0.25708153745295514), "
[36m(TaskRunner pid=2823680)[0m  "'val-core/math500/acc/best@4/mean': np.float64(0.554798), "
[36m(TaskRunner pid=2823680)[0m  "'val-core/math500/acc/best@4/std': np.float64(0.1715609027698682), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/acc/worst@4/mean': np.float64(0.1397), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/acc/worst@4/std': np.float64(0.1205417649724046), "
[36m(TaskRunner pid=2823680)[0m  "'val-core/math500/acc/maj@4/mean': np.float64(0.342432), "
[36m(TaskRunner pid=2823680)[0m  "'val-core/math500/acc/maj@4/std': np.float64(0.2370124781902772), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/num_turns/min': np.int32(2), 'val-aux/num_turns/max': np.int32(2), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/num_turns/mean': np.float64(2.0), "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/response_length/clip_ratio': 0.11689189189189189, "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2024/response_length/clip_ratio': 0.19375, "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/aime2025/response_length/clip_ratio': 0.16875, "
[36m(TaskRunner pid=2823680)[0m  "'val-aux/math500/response_length/clip_ratio': 0.086}")
[36m(TaskRunner pid=2823680)[0m step:0 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:0.0 - frontier/mean_score:2.0 - frontier/mean_frontier_pct:0.0 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:0.0 - frontier/batch_hard_count:0.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.0 - frontier/cluster_14/cluster_size:208.0 - 
frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - 
frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:0.0 - frontier/cluster_50/score:2.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - 
frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:0.0 - cluster/prob_snapshot/cluster_0:0.015625 - cluster/prob_snapshot/cluster_1:0.015625 - cluster/prob_snapshot/cluster_2:0.015625 - cluster/prob_snapshot/cluster_3:0.015625 - cluster/prob_snapshot/cluster_4:0.015625 - cluster/prob_snapshot/cluster_5:0.015625 - cluster/prob_snapshot/cluster_6:0.015625 - cluster/prob_snapshot/cluster_7:0.015625 - cluster/prob_snapshot/cluster_8:0.015625 - cluster/prob_snapshot/cluster_9:0.015625 - cluster/prob_snapshot/cluster_10:0.015625 - cluster/prob_snapshot/cluster_11:0.015625 - cluster/prob_snapshot/cluster_12:0.015625 - cluster/prob_snapshot/cluster_13:0.015625 - 
cluster/prob_snapshot/cluster_14:0.015625 - cluster/prob_snapshot/cluster_15:0.015625 - cluster/prob_snapshot/cluster_16:0.015625 - cluster/prob_snapshot/cluster_17:0.015625 - cluster/prob_snapshot/cluster_18:0.015625 - cluster/prob_snapshot/cluster_19:0.015625 - cluster/prob_snapshot/cluster_20:0.015625 - cluster/prob_snapshot/cluster_21:0.015625 - cluster/prob_snapshot/cluster_22:0.015625 - cluster/prob_snapshot/cluster_23:0.015625 - cluster/prob_snapshot/cluster_24:0.015625 - cluster/prob_snapshot/cluster_25:0.015625 - cluster/prob_snapshot/cluster_26:0.015625 - cluster/prob_snapshot/cluster_27:0.015625 - cluster/prob_snapshot/cluster_28:0.015625 - cluster/prob_snapshot/cluster_29:0.015625 - cluster/prob_snapshot/cluster_30:0.015625 - cluster/prob_snapshot/cluster_31:0.015625 - cluster/prob_snapshot/cluster_32:0.015625 - cluster/prob_snapshot/cluster_33:0.015625 - cluster/prob_snapshot/cluster_34:0.015625 - cluster/prob_snapshot/cluster_35:0.015625 - cluster/prob_snapshot/cluster_36:0.015625 - cluster/prob_snapshot/cluster_37:0.015625 - cluster/prob_snapshot/cluster_38:0.015625 - cluster/prob_snapshot/cluster_39:0.015625 - cluster/prob_snapshot/cluster_40:0.015625 - cluster/prob_snapshot/cluster_41:0.015625 - cluster/prob_snapshot/cluster_42:0.015625 - cluster/prob_snapshot/cluster_43:0.015625 - cluster/prob_snapshot/cluster_44:0.015625 - cluster/prob_snapshot/cluster_45:0.015625 - cluster/prob_snapshot/cluster_46:0.015625 - cluster/prob_snapshot/cluster_47:0.015625 - cluster/prob_snapshot/cluster_48:0.015625 - cluster/prob_snapshot/cluster_49:0.015625 - cluster/prob_snapshot/cluster_50:0.015625 - cluster/prob_snapshot/cluster_51:0.015625 - cluster/prob_snapshot/cluster_52:0.015625 - cluster/prob_snapshot/cluster_53:0.015625 - cluster/prob_snapshot/cluster_54:0.015625 - cluster/prob_snapshot/cluster_55:0.015625 - cluster/prob_snapshot/cluster_56:0.015625 - cluster/prob_snapshot/cluster_57:0.015625 - cluster/prob_snapshot/cluster_58:0.015625 - 
cluster/prob_snapshot/cluster_59:0.015625 - cluster/prob_snapshot/cluster_60:0.015625 - cluster/prob_snapshot/cluster_61:0.015625 - cluster/prob_snapshot/cluster_62:0.015625 - cluster/prob_snapshot/cluster_63:0.015625
[36m(TaskRunner pid=2823680)[0m Training Progress:   0%|          | 0/800 [00:00<?, ?it/s]
[36m(WorkerDict pid=2825160)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.
[36m(WorkerDict pid=2825160)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")
[36m(TaskRunner pid=2823680)[0m Training Progress:   0%|          | 1/800 [01:31<20:12:35, 91.06s/it]
[36m(WorkerDict pid=2825159)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.[32m [repeated 3x across cluster][0m
[36m(WorkerDict pid=2825159)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")[32m [repeated 3x across cluster][0m
[36m(TaskRunner pid=2823680)[0m step:1 - global_seqlen/min:307758 - global_seqlen/max:374265 - global_seqlen/minmax_diff:66507 - global_seqlen/balanced_min:351836 - global_seqlen/balanced_max:351950 - global_seqlen/mean:351910.5 - frontier/skipped_zero_acc_count:84.0 - actor/entropy:np.float64(1.218673843551766) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.051303990359883755) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(8.392640459285097e-05) - actor/ppo_kl:np.float64(-3.2063081885202e-05) - actor/pg_clipfrac_lower:np.float64(7.323818532644178e-07) - actor/grad_norm:np.float64(0.3113647773861885) - perf/mfu/actor:np.float64(0.3007753157665214) - perf/max_memory_allocated_gb:np.float64(40.732741355895996) - perf/max_memory_reserved_gb:np.float64(49.15234375) - perf/cpu_memory_used_gb:np.float64(103.96915435791016) - actor/lr:np.float64(1e-06) - training/global_step:1 - training/epoch:0 - critic/score/mean:0.1931818127632141 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.1931818127632141 - critic/rewards/max:1.0 - critic/rewards/min:0.0 - critic/advantages/mean:-0.010107602924108505 - critic/advantages/max:2.4748666286468506 - critic/advantages/min:-1.6201815605163574 - critic/returns/mean:-0.010107602924108505 - critic/returns/max:2.4748666286468506 - critic/returns/min:-1.6201815605163574 - response_length/mean:1059.7869873046875 - response_length/max:8192.0 - response_length/min:26.0 - response_length/clip_ratio:0.008522727526724339 - response_length_non_aborted/mean:1059.7869873046875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:26.0 - response_length_non_aborted/clip_ratio:0.008522727526724339 - response/aborted_ratio:0.0 - prompt_length/mean:250.81817626953125 - prompt_length/max:667.0 - prompt_length/min:189.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - 
num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.169002830982208e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.21355837024748325) - timing_s/agent_loop/generate_sequences/max:np.float64(27.601516486145556) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.54047199469278) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.601516486145556) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:202 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.17371532227844 - timing_s/reward:5.481019616127014e-05 - timing_s/old_log_prob:9.499120791442692 - timing_s/ref:14.60401158966124 - timing_s/adv:0.0355815626680851 - timing_s/update_actor:13.648491045460105 - timing_s/update_weights:22.5630676811561 - timing_s/step:89.88430902734399 - timing_s/stop_profile:5.7423487305641174e-05 - timing_per_token_ms/adv:7.712772047108075e-05 - timing_per_token_ms/update_actor:0.029584900810174224 - timing_per_token_ms/gen:0.07820427916813907 - timing_per_token_ms/ref:0.03165611735917708 - perf/total_num_tokens:1407642 - perf/time_per_step:89.88430902734399 - perf/throughput:3915.149416044843 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:84.0 - frontier/mean_score:1.9249999999999998 - frontier/mean_frontier_pct:0.024190347874336122 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:0.0 - frontier/batch_hard_count:16.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - 
frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:1.7 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.7 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:1.7 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:1.7 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - 
frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:1.7 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:1.7 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:1.7 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - 
frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:1.7 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.7 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.7 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:1.7 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:1.7 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - 
frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:1.7 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:1.7 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:1.0 - cluster/prob_snapshot/cluster_0:0.016233766233766236 - cluster/prob_snapshot/cluster_1:0.016233766233766236 - cluster/prob_snapshot/cluster_2:0.016233766233766236 - cluster/prob_snapshot/cluster_3:0.0137987012987013 - cluster/prob_snapshot/cluster_4:0.016233766233766236 - cluster/prob_snapshot/cluster_5:0.016233766233766236 - cluster/prob_snapshot/cluster_6:0.0137987012987013 - cluster/prob_snapshot/cluster_7:0.016233766233766236 - cluster/prob_snapshot/cluster_8:0.0137987012987013 - cluster/prob_snapshot/cluster_9:0.016233766233766236 - cluster/prob_snapshot/cluster_10:0.016233766233766236 - cluster/prob_snapshot/cluster_11:0.016233766233766236 - cluster/prob_snapshot/cluster_12:0.016233766233766236 - cluster/prob_snapshot/cluster_13:0.016233766233766236 - cluster/prob_snapshot/cluster_14:0.0137987012987013 - cluster/prob_snapshot/cluster_15:0.016233766233766236 - cluster/prob_snapshot/cluster_16:0.016233766233766236 - cluster/prob_snapshot/cluster_17:0.016233766233766236 - cluster/prob_snapshot/cluster_18:0.016233766233766236 - 
cluster/prob_snapshot/cluster_19:0.016233766233766236 - cluster/prob_snapshot/cluster_20:0.016233766233766236 - cluster/prob_snapshot/cluster_21:0.016233766233766236 - cluster/prob_snapshot/cluster_22:0.016233766233766236 - cluster/prob_snapshot/cluster_23:0.0137987012987013 - cluster/prob_snapshot/cluster_24:0.016233766233766236 - cluster/prob_snapshot/cluster_25:0.016233766233766236 - cluster/prob_snapshot/cluster_26:0.016233766233766236 - cluster/prob_snapshot/cluster_27:0.016233766233766236 - cluster/prob_snapshot/cluster_28:0.0137987012987013 - cluster/prob_snapshot/cluster_29:0.016233766233766236 - cluster/prob_snapshot/cluster_30:0.016233766233766236 - cluster/prob_snapshot/cluster_31:0.016233766233766236 - cluster/prob_snapshot/cluster_32:0.0137987012987013 - cluster/prob_snapshot/cluster_33:0.016233766233766236 - cluster/prob_snapshot/cluster_34:0.016233766233766236 - cluster/prob_snapshot/cluster_35:0.016233766233766236 - cluster/prob_snapshot/cluster_36:0.016233766233766236 - cluster/prob_snapshot/cluster_37:0.016233766233766236 - cluster/prob_snapshot/cluster_38:0.016233766233766236 - cluster/prob_snapshot/cluster_39:0.016233766233766236 - cluster/prob_snapshot/cluster_40:0.016233766233766236 - cluster/prob_snapshot/cluster_41:0.0137987012987013 - cluster/prob_snapshot/cluster_42:0.016233766233766236 - cluster/prob_snapshot/cluster_43:0.016233766233766236 - cluster/prob_snapshot/cluster_44:0.0137987012987013 - cluster/prob_snapshot/cluster_45:0.016233766233766236 - cluster/prob_snapshot/cluster_46:0.016233766233766236 - cluster/prob_snapshot/cluster_47:0.016233766233766236 - cluster/prob_snapshot/cluster_48:0.0137987012987013 - cluster/prob_snapshot/cluster_49:0.0137987012987013 - cluster/prob_snapshot/cluster_50:0.0137987012987013 - cluster/prob_snapshot/cluster_51:0.016233766233766236 - cluster/prob_snapshot/cluster_52:0.0137987012987013 - cluster/prob_snapshot/cluster_53:0.016233766233766236 - cluster/prob_snapshot/cluster_54:0.0137987012987013 - 
cluster/prob_snapshot/cluster_55:0.016233766233766236 - cluster/prob_snapshot/cluster_56:0.016233766233766236 - cluster/prob_snapshot/cluster_57:0.016233766233766236 - cluster/prob_snapshot/cluster_58:0.016233766233766236 - cluster/prob_snapshot/cluster_59:0.0137987012987013 - cluster/prob_snapshot/cluster_60:0.016233766233766236 - cluster/prob_snapshot/cluster_61:0.016233766233766236 - cluster/prob_snapshot/cluster_62:0.0137987012987013 - cluster/prob_snapshot/cluster_63:0.016233766233766236
[36m(RewardLoopWorker pid=2826755)[0m WARNING:2026-04-12 11:33:55,127:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:   0%|          | 2/800 [02:47<18:20:27, 82.74s/it]
[36m(TaskRunner pid=2823680)[0m step:2 - global_seqlen/min:311952 - global_seqlen/max:368579 - global_seqlen/minmax_diff:56627 - global_seqlen/balanced_min:332614 - global_seqlen/balanced_max:332764 - global_seqlen/mean:332697.25 - frontier/skipped_zero_acc_count:91.0 - actor/entropy:np.float64(1.0297418405350887) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:-0.0001565921847941354 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.06228092346646008) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00010157827062740628) - actor/ppo_kl:np.float64(1.8960741337985765e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.3413967669010162) - perf/mfu/actor:np.float64(0.33630315543739775) - perf/max_memory_allocated_gb:np.float64(41.27672815322876) - perf/max_memory_reserved_gb:np.float64(49.15234375) - perf/cpu_memory_used_gb:np.float64(107.0389633178711) - actor/lr:np.float64(1e-06) - training/global_step:2 - training/epoch:0 - critic/score/mean:0.2060810774564743 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.20607200264930725 - critic/rewards/max:1.0022047758102417 - critic/rewards/min:-0.0021450743079185486 - critic/advantages/mean:-0.04003187641501427 - critic/advantages/max:2.474865436553955 - critic/advantages/min:-2.4748640060424805 - critic/returns/mean:-0.04003187641501427 - critic/returns/max:2.474865436553955 - critic/returns/min:-2.4748640060424805 - response_length/mean:1016.5979614257812 - response_length/max:8192.0 - response_length/min:2.0 - response_length/clip_ratio:0.0033783784601837397 - response_length_non_aborted/mean:1016.5979614257812 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:2.0 - response_length_non_aborted/clip_ratio:0.0033783784601837397 - response/aborted_ratio:0.0 - prompt_length/mean:248.45945739746094 - prompt_length/max:381.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.778367191553116e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.12678392883390188) - timing_s/agent_loop/generate_sequences/max:np.float64(26.5291742477566) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.353725721891351) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(26.5291742477566) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:260 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.808032368309796 - timing_s/reward:9.775441139936447e-05 - timing_s/old_log_prob:7.654721691273153 - timing_s/ref:8.091793842613697 - timing_s/adv:0.0475091440603137 - timing_s/update_actor:11.48213972710073 - timing_s/update_weights:17.32431700080633 - timing_s/step:76.72243580501527 - timing_s/stop_profile:6.247684359550476e-05 - timing_per_token_ms/adv:0.00012687476548792974 - timing_per_token_ms/update_actor:0.03066343993329202 - timing_per_token_ms/gen:0.10570507877130532 - timing_per_token_ms/ref:0.02160940733545827 - perf/total_num_tokens:1330789 - perf/time_per_step:76.72243580501527 - perf/throughput:4336.374966581183 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:175.0 - frontier/mean_score:1.8528125 - frontier/mean_frontier_pct:0.054086394791004536 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:0.0 - frontier/batch_hard_count:16.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:1.7 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:1.7 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:1.7 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.7 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:1.7 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:1.7 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.7 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:1.7 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:1.7 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.0 - 
frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.7 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:1.7 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:1.7 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:1.7 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:1.7 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:1.7 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:1.7 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:1.7 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:1.7 - frontier/cluster_36/cluster_size:108.0 - 
frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:1.7 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.7 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.7 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:1.49 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - 
frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:1.7 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:1.7 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:1.7 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:2.0 - cluster/prob_snapshot/cluster_0:0.016866250632484397 - cluster/prob_snapshot/cluster_1:0.014336313037611738 - cluster/prob_snapshot/cluster_2:0.016866250632484397 - cluster/prob_snapshot/cluster_3:0.014336313037611738 - cluster/prob_snapshot/cluster_4:0.016866250632484397 - cluster/prob_snapshot/cluster_5:0.014336313037611738 - cluster/prob_snapshot/cluster_6:0.014336313037611738 - cluster/prob_snapshot/cluster_7:0.014336313037611738 - cluster/prob_snapshot/cluster_8:0.014336313037611738 - cluster/prob_snapshot/cluster_9:0.014336313037611738 - cluster/prob_snapshot/cluster_10:0.016866250632484397 - cluster/prob_snapshot/cluster_11:0.016866250632484397 - cluster/prob_snapshot/cluster_12:0.016866250632484397 - cluster/prob_snapshot/cluster_13:0.016866250632484397 - cluster/prob_snapshot/cluster_14:0.014336313037611738 - cluster/prob_snapshot/cluster_15:0.016866250632484397 - cluster/prob_snapshot/cluster_16:0.016866250632484397 - cluster/prob_snapshot/cluster_17:0.014336313037611738 - 
cluster/prob_snapshot/cluster_18:0.016866250632484397 - cluster/prob_snapshot/cluster_19:0.016866250632484397 - cluster/prob_snapshot/cluster_20:0.016866250632484397 - cluster/prob_snapshot/cluster_21:0.014336313037611738 - cluster/prob_snapshot/cluster_22:0.016866250632484397 - cluster/prob_snapshot/cluster_23:0.014336313037611738 - cluster/prob_snapshot/cluster_24:0.016866250632484397 - cluster/prob_snapshot/cluster_25:0.016866250632484397 - cluster/prob_snapshot/cluster_26:0.016866250632484397 - cluster/prob_snapshot/cluster_27:0.014336313037611738 - cluster/prob_snapshot/cluster_28:0.014336313037611738 - cluster/prob_snapshot/cluster_29:0.016866250632484397 - cluster/prob_snapshot/cluster_30:0.014336313037611738 - cluster/prob_snapshot/cluster_31:0.014336313037611738 - cluster/prob_snapshot/cluster_32:0.014336313037611738 - cluster/prob_snapshot/cluster_33:0.014336313037611738 - cluster/prob_snapshot/cluster_34:0.016866250632484397 - cluster/prob_snapshot/cluster_35:0.014336313037611738 - cluster/prob_snapshot/cluster_36:0.014336313037611738 - cluster/prob_snapshot/cluster_37:0.016866250632484397 - cluster/prob_snapshot/cluster_38:0.016866250632484397 - cluster/prob_snapshot/cluster_39:0.016866250632484397 - cluster/prob_snapshot/cluster_40:0.016866250632484397 - cluster/prob_snapshot/cluster_41:0.014336313037611738 - cluster/prob_snapshot/cluster_42:0.016866250632484397 - cluster/prob_snapshot/cluster_43:0.016866250632484397 - cluster/prob_snapshot/cluster_44:0.014336313037611738 - cluster/prob_snapshot/cluster_45:0.016866250632484397 - cluster/prob_snapshot/cluster_46:0.016866250632484397 - cluster/prob_snapshot/cluster_47:0.016866250632484397 - cluster/prob_snapshot/cluster_48:0.014336313037611738 - cluster/prob_snapshot/cluster_49:0.012565356721200877 - cluster/prob_snapshot/cluster_50:0.014336313037611738 - cluster/prob_snapshot/cluster_51:0.016866250632484397 - cluster/prob_snapshot/cluster_52:0.012565356721200877 - 
cluster/prob_snapshot/cluster_53:0.016866250632484397 - cluster/prob_snapshot/cluster_54:0.014336313037611738 - cluster/prob_snapshot/cluster_55:0.016866250632484397 - cluster/prob_snapshot/cluster_56:0.016866250632484397 - cluster/prob_snapshot/cluster_57:0.014336313037611738 - cluster/prob_snapshot/cluster_58:0.016866250632484397 - cluster/prob_snapshot/cluster_59:0.014336313037611738 - cluster/prob_snapshot/cluster_60:0.014336313037611738 - cluster/prob_snapshot/cluster_61:0.016866250632484397 - cluster/prob_snapshot/cluster_62:0.014336313037611738 - cluster/prob_snapshot/cluster_63:0.016866250632484397
[36m(TaskRunner pid=2823680)[0m Training Progress:   0%|          | 3/800 [04:10<18:17:48, 82.65s/it]
[36m(TaskRunner pid=2823680)[0m step:3 - global_seqlen/min:337215 - global_seqlen/max:364419 - global_seqlen/minmax_diff:27204 - global_seqlen/balanced_min:349092 - global_seqlen/balanced_max:349155 - global_seqlen/mean:349114.25 - frontier/skipped_zero_acc_count:79.0 - actor/entropy:np.float64(1.1244758370518684) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:-0.00026104706921614707 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.08011906401952729) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00017859543244412633) - actor/ppo_kl:np.float64(-5.680503563780803e-06) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.3214464081185205) - perf/mfu/actor:np.float64(0.3337210331706566) - perf/max_memory_allocated_gb:np.float64(52.65984106063843) - perf/max_memory_reserved_gb:np.float64(61.890625) - perf/cpu_memory_used_gb:np.float64(110.8545150756836) - actor/lr:np.float64(1e-06) - training/global_step:3 - training/epoch:0 - critic/score/mean:0.20663265883922577 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.20652705430984497 - critic/rewards/max:1.0025725364685059 - critic/rewards/min:-0.00362599384970963 - critic/advantages/mean:-0.0945703461766243 - critic/advantages/max:2.4748661518096924 - critic/advantages/min:-1.2084728479385376 - critic/returns/mean:-0.0945703461766243 - critic/returns/max:2.4748661518096924 - critic/returns/min:-1.2084728479385376 - response_length/mean:1082.283203125 - response_length/max:8192.0 - response_length/min:2.0 - response_length/clip_ratio:0.010204081423580647 - response_length_non_aborted/mean:1082.283203125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:2.0 - response_length_non_aborted/clip_ratio:0.010204081423580647 - response/aborted_ratio:0.0 - prompt_length/mean:235.89796447753906 - prompt_length/max:352.0 - prompt_length/min:179.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.055288344621658e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.16519903019070625) - timing_s/agent_loop/generate_sequences/max:np.float64(27.38619049359113) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.482753465824317) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.38619049359113) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:186 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.423395794816315 - timing_s/reward:9.747222065925598e-05 - timing_s/old_log_prob:7.118991430848837 - timing_s/ref:11.47437506262213 - timing_s/adv:0.04784395173192024 - timing_s/update_actor:13.15414734557271 - timing_s/update_weights:20.796433073468506 - timing_s/step:82.35336195770651 - timing_s/stop_profile:5.7252123951911926e-05 - timing_per_token_ms/adv:9.259038473298326e-05 - timing_per_token_ms/update_actor:0.025456667341889838 - timing_per_token_ms/gen:0.06935309140685747 - timing_per_token_ms/ref:0.022205874790018964 - perf/total_num_tokens:1396457 - perf/time_per_step:82.35336195770651 - perf/throughput:4239.222804034297 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:254.0 - frontier/mean_score:1.78765625 - frontier/mean_frontier_pct:0.08154173284405208 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:0.0 - frontier/batch_hard_count:16.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:1.7 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:1.49 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:1.7 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:1.7 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:1.7 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.7 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:1.7 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:1.7 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:1.7 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.0 - 
frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.7 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.7 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:1.49 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:1.7 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:1.7 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:1.7 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:1.7 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:1.7 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:1.49 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:1.7 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:32.0 - frontier/cluster_36/score:1.49 - frontier/cluster_36/cluster_size:108.0 - 
frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:1.7 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.7 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.7 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:1.49 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:1.49 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - 
frontier/cluster_55/score:1.7 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:1.7 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:1.7 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:1.7 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:3.0 - cluster/prob_snapshot/cluster_0:0.0174809894240014 - cluster/prob_snapshot/cluster_1:0.014858841010401188 - cluster/prob_snapshot/cluster_2:0.014858841010401188 - cluster/prob_snapshot/cluster_3:0.013023337120881042 - cluster/prob_snapshot/cluster_4:0.014858841010401188 - cluster/prob_snapshot/cluster_5:0.014858841010401188 - cluster/prob_snapshot/cluster_6:0.013023337120881042 - cluster/prob_snapshot/cluster_7:0.014858841010401188 - cluster/prob_snapshot/cluster_8:0.014858841010401188 - cluster/prob_snapshot/cluster_9:0.014858841010401188 - cluster/prob_snapshot/cluster_10:0.0174809894240014 - cluster/prob_snapshot/cluster_11:0.014858841010401188 - cluster/prob_snapshot/cluster_12:0.0174809894240014 - cluster/prob_snapshot/cluster_13:0.0174809894240014 - cluster/prob_snapshot/cluster_14:0.014858841010401188 - cluster/prob_snapshot/cluster_15:0.014858841010401188 - cluster/prob_snapshot/cluster_16:0.0174809894240014 - cluster/prob_snapshot/cluster_17:0.014858841010401188 - 
cluster/prob_snapshot/cluster_18:0.0174809894240014 - cluster/prob_snapshot/cluster_19:0.0174809894240014 - cluster/prob_snapshot/cluster_20:0.014858841010401188 - cluster/prob_snapshot/cluster_21:0.014858841010401188 - cluster/prob_snapshot/cluster_22:0.0174809894240014 - cluster/prob_snapshot/cluster_23:0.013023337120881042 - cluster/prob_snapshot/cluster_24:0.014858841010401188 - cluster/prob_snapshot/cluster_25:0.0174809894240014 - cluster/prob_snapshot/cluster_26:0.014858841010401188 - cluster/prob_snapshot/cluster_27:0.014858841010401188 - cluster/prob_snapshot/cluster_28:0.014858841010401188 - cluster/prob_snapshot/cluster_29:0.0174809894240014 - cluster/prob_snapshot/cluster_30:0.014858841010401188 - cluster/prob_snapshot/cluster_31:0.014858841010401188 - cluster/prob_snapshot/cluster_32:0.013023337120881042 - cluster/prob_snapshot/cluster_33:0.013023337120881042 - cluster/prob_snapshot/cluster_34:0.0174809894240014 - cluster/prob_snapshot/cluster_35:0.014858841010401188 - cluster/prob_snapshot/cluster_36:0.013023337120881042 - cluster/prob_snapshot/cluster_37:0.0174809894240014 - cluster/prob_snapshot/cluster_38:0.0174809894240014 - cluster/prob_snapshot/cluster_39:0.0174809894240014 - cluster/prob_snapshot/cluster_40:0.0174809894240014 - cluster/prob_snapshot/cluster_41:0.014858841010401188 - cluster/prob_snapshot/cluster_42:0.0174809894240014 - cluster/prob_snapshot/cluster_43:0.0174809894240014 - cluster/prob_snapshot/cluster_44:0.014858841010401188 - cluster/prob_snapshot/cluster_45:0.0174809894240014 - cluster/prob_snapshot/cluster_46:0.0174809894240014 - cluster/prob_snapshot/cluster_47:0.0174809894240014 - cluster/prob_snapshot/cluster_48:0.014858841010401188 - cluster/prob_snapshot/cluster_49:0.013023337120881042 - cluster/prob_snapshot/cluster_50:0.014858841010401188 - cluster/prob_snapshot/cluster_51:0.0174809894240014 - cluster/prob_snapshot/cluster_52:0.013023337120881042 - cluster/prob_snapshot/cluster_53:0.014858841010401188 - 
cluster/prob_snapshot/cluster_54:0.013023337120881042 - cluster/prob_snapshot/cluster_55:0.014858841010401188 - cluster/prob_snapshot/cluster_56:0.0174809894240014 - cluster/prob_snapshot/cluster_57:0.014858841010401188 - cluster/prob_snapshot/cluster_58:0.0174809894240014 - cluster/prob_snapshot/cluster_59:0.014858841010401188 - cluster/prob_snapshot/cluster_60:0.014858841010401188 - cluster/prob_snapshot/cluster_61:0.0174809894240014 - cluster/prob_snapshot/cluster_62:0.014858841010401188 - cluster/prob_snapshot/cluster_63:0.0174809894240014
[36m(TaskRunner pid=2823680)[0m Training Progress:   0%|          | 4/800 [05:27<17:45:37, 80.32s/it]
[36m(TaskRunner pid=2823680)[0m step:4 - global_seqlen/min:335766 - global_seqlen/max:343754 - global_seqlen/minmax_diff:7988 - global_seqlen/balanced_min:338768 - global_seqlen/balanced_max:338868 - global_seqlen/mean:338829.25 - frontier/skipped_zero_acc_count:83.0 - actor/entropy:np.float64(0.9683397545114808) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0002055244258372113 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.03619265649467707) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00021772364340469485) - actor/ppo_kl:np.float64(-1.4693175013067245e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.3352530747652054) - perf/mfu/actor:np.float64(0.3382416281058676) - perf/max_memory_allocated_gb:np.float64(52.65984106063843) - perf/max_memory_reserved_gb:np.float64(61.890625) - perf/cpu_memory_used_gb:np.float64(110.88787841796875) - actor/lr:np.float64(1e-06) - training/global_step:4 - training/epoch:0 - critic/score/mean:0.25833332538604736 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.2580924332141876 - critic/rewards/max:1.0018163919448853 - critic/rewards/min:-0.005291802808642387 - critic/advantages/mean:-0.09599535912275314 - critic/advantages/max:2.4748644828796387 - critic/advantages/min:-2.474865198135376 - critic/returns/mean:-0.09599535912275314 - critic/returns/max:2.4748644828796387 - critic/returns/min:-2.474865198135376 - response_length/mean:1054.925048828125 - response_length/max:8192.0 - response_length/min:5.0 - response_length/clip_ratio:0.0027777778450399637 - response_length_non_aborted/mean:1054.925048828125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:5.0 - response_length_non_aborted/clip_ratio:0.0027777778450399637 - response/aborted_ratio:0.0 - prompt_length/mean:229.3111114501953 - prompt_length/max:381.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.683186024427414e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.5323486961424351) - timing_s/agent_loop/generate_sequences/max:np.float64(27.412411654368043) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.8114396865539675) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.412411654368043) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:204 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:28.91851318348199 - timing_s/reward:0.00012585148215293884 - timing_s/old_log_prob:5.677919765934348 - timing_s/ref:11.04783299472183 - timing_s/adv:0.03217011783272028 - timing_s/update_actor:11.640129011124372 - timing_s/update_weights:18.886287251487374 - timing_s/step:76.57042727898806 - timing_s/stop_profile:6.11012801527977e-05 - timing_per_token_ms/adv:6.958334036169422e-05 - timing_per_token_ms/update_actor:0.025177373084138588 - timing_per_token_ms/gen:0.07614683819935064 - timing_per_token_ms/ref:0.023896248298754835 - perf/total_num_tokens:1355317 - perf/time_per_step:76.57042727898806 - perf/throughput:4425.066726681036 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:337.0 - frontier/mean_score:1.7306562499999998 - frontier/mean_frontier_pct:0.10659806348455064 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:0.0 - frontier/batch_hard_count:16.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:1.7 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:1.49 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:1.343 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:1.7 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:1.49 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:1.7 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:1.7 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:1.7 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:1.49 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:1.7 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - 
frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.7 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.7 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:1.49 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:1.7 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:1.7 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:1.7 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:1.7 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:1.7 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:1.49 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:1.7 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:32.0 - frontier/cluster_36/score:1.49 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:1.7 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:1.7 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:1.7 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.7 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.7 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:1.343 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.7 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:1.343 - frontier/cluster_54/cluster_size:122.0 - 
frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:1.7 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:1.49 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:1.7 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:1.7 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:1.7 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:4.0 - cluster/prob_snapshot/cluster_0:0.015348224120185624 - cluster/prob_snapshot/cluster_1:0.013452267022986224 - cluster/prob_snapshot/cluster_2:0.013452267022986224 - cluster/prob_snapshot/cluster_3:0.012125097054946644 - cluster/prob_snapshot/cluster_4:0.015348224120185624 - cluster/prob_snapshot/cluster_5:0.015348224120185624 - cluster/prob_snapshot/cluster_6:0.012125097054946644 - cluster/prob_snapshot/cluster_7:0.013452267022986224 - cluster/prob_snapshot/cluster_8:0.015348224120185624 - cluster/prob_snapshot/cluster_9:0.013452267022986224 - cluster/prob_snapshot/cluster_10:0.01805673425904191 - cluster/prob_snapshot/cluster_11:0.015348224120185624 - cluster/prob_snapshot/cluster_12:0.01805673425904191 - cluster/prob_snapshot/cluster_13:0.015348224120185624 - cluster/prob_snapshot/cluster_14:0.015348224120185624 - cluster/prob_snapshot/cluster_15:0.013452267022986224 - cluster/prob_snapshot/cluster_16:0.01805673425904191 - 
cluster/prob_snapshot/cluster_17:0.015348224120185624 - cluster/prob_snapshot/cluster_18:0.01805673425904191 - cluster/prob_snapshot/cluster_19:0.01805673425904191 - cluster/prob_snapshot/cluster_20:0.015348224120185624 - cluster/prob_snapshot/cluster_21:0.015348224120185624 - cluster/prob_snapshot/cluster_22:0.01805673425904191 - cluster/prob_snapshot/cluster_23:0.013452267022986224 - cluster/prob_snapshot/cluster_24:0.015348224120185624 - cluster/prob_snapshot/cluster_25:0.01805673425904191 - cluster/prob_snapshot/cluster_26:0.015348224120185624 - cluster/prob_snapshot/cluster_27:0.015348224120185624 - cluster/prob_snapshot/cluster_28:0.015348224120185624 - cluster/prob_snapshot/cluster_29:0.01805673425904191 - cluster/prob_snapshot/cluster_30:0.015348224120185624 - cluster/prob_snapshot/cluster_31:0.015348224120185624 - cluster/prob_snapshot/cluster_32:0.013452267022986224 - cluster/prob_snapshot/cluster_33:0.013452267022986224 - cluster/prob_snapshot/cluster_34:0.01805673425904191 - cluster/prob_snapshot/cluster_35:0.015348224120185624 - cluster/prob_snapshot/cluster_36:0.013452267022986224 - cluster/prob_snapshot/cluster_37:0.01805673425904191 - cluster/prob_snapshot/cluster_38:0.01805673425904191 - cluster/prob_snapshot/cluster_39:0.015348224120185624 - cluster/prob_snapshot/cluster_40:0.01805673425904191 - cluster/prob_snapshot/cluster_41:0.015348224120185624 - cluster/prob_snapshot/cluster_42:0.015348224120185624 - cluster/prob_snapshot/cluster_43:0.01805673425904191 - cluster/prob_snapshot/cluster_44:0.015348224120185624 - cluster/prob_snapshot/cluster_45:0.01805673425904191 - cluster/prob_snapshot/cluster_46:0.01805673425904191 - cluster/prob_snapshot/cluster_47:0.01805673425904191 - cluster/prob_snapshot/cluster_48:0.015348224120185624 - cluster/prob_snapshot/cluster_49:0.012125097054946644 - cluster/prob_snapshot/cluster_50:0.015348224120185624 - cluster/prob_snapshot/cluster_51:0.015348224120185624 - 
cluster/prob_snapshot/cluster_52:0.013452267022986224 - cluster/prob_snapshot/cluster_53:0.015348224120185624 - cluster/prob_snapshot/cluster_54:0.012125097054946644 - cluster/prob_snapshot/cluster_55:0.015348224120185624 - cluster/prob_snapshot/cluster_56:0.01805673425904191 - cluster/prob_snapshot/cluster_57:0.013452267022986224 - cluster/prob_snapshot/cluster_58:0.015348224120185624 - cluster/prob_snapshot/cluster_59:0.015348224120185624 - cluster/prob_snapshot/cluster_60:0.015348224120185624 - cluster/prob_snapshot/cluster_61:0.01805673425904191 - cluster/prob_snapshot/cluster_62:0.015348224120185624 - cluster/prob_snapshot/cluster_63:0.01805673425904191
[36m(TaskRunner pid=2823680)[0m Training Progress:   1%|          | 5/800 [06:52<18:08:26, 82.15s/it]
[36m(TaskRunner pid=2823680)[0m step:5 - global_seqlen/min:303153 - global_seqlen/max:392411 - global_seqlen/minmax_diff:89258 - global_seqlen/balanced_min:331957 - global_seqlen/balanced_max:332048 - global_seqlen/mean:331997.0 - frontier/skipped_zero_acc_count:63.0 - actor/entropy:np.float64(1.064045929321737) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.00025945223751477897 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05501608721442608) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006061416517530398) - actor/ppo_kl:np.float64(0.00021647771781848255) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.3303295390473472) - perf/mfu/actor:np.float64(0.18371658384576908) - perf/max_memory_allocated_gb:np.float64(52.65984106063843) - perf/max_memory_reserved_gb:np.float64(61.890625) - perf/cpu_memory_used_gb:np.float64(111.3106460571289) - actor/lr:np.float64(1e-06) - training/global_step:5 - training/epoch:0 - critic/score/mean:0.26538461446762085 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.26501592993736267 - critic/rewards/max:1.0027515888214111 - critic/rewards/min:-0.005677666049450636 - critic/advantages/mean:-0.07751592993736267 - critic/advantages/max:2.474864959716797 - critic/advantages/min:-2.474862575531006 - critic/returns/mean:-0.07751592993736267 - critic/returns/max:2.474864959716797 - critic/returns/min:-2.474862575531006 - response_length/mean:987.201904296875 - response_length/max:8192.0 - response_length/min:5.0 - response_length/clip_ratio:0.003846153849735856 - response_length_non_aborted/mean:987.201904296875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:5.0 - response_length_non_aborted/clip_ratio:0.003846153849735856 - response/aborted_ratio:0.0 - prompt_length/mean:238.73846435546875 - prompt_length/max:508.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.830094546079636e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.21434118691831827) - timing_s/agent_loop/generate_sequences/max:np.float64(27.45843897946179) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.328019829817094) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.45843897946179) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:184 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:28.899120338261127 - timing_s/reward:0.00010616891086101532 - timing_s/old_log_prob:7.604341987520456 - timing_s/ref:6.899024096317589 - timing_s/adv:0.05220412742346525 - timing_s/update_actor:21.011088654398918 - timing_s/update_weights:20.36460952460766 - timing_s/step:85.19605164602399 - timing_s/stop_profile:5.664769560098648e-05 - timing_per_token_ms/adv:8.189024033899448e-05 - timing_per_token_ms/update_actor:0.03295913914498747 - timing_per_token_ms/gen:0.05629570822402308 - timing_per_token_ms/ref:0.01082218531820563 - perf/total_num_tokens:1327988 - perf/time_per_step:85.19605164602399 - perf/throughput:3896.858992707721 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:400.0 - frontier/mean_score:1.6726437499999998 - frontier/mean_frontier_pct:0.13263508652273692 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:0.0 - frontier/batch_hard_count:16.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:1.7 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:1.49 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:1.2401 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:1.49 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:1.49 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:1.7 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:1.7 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:1.7 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:1.7 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:1.7 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:1.343 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:1.7 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:1.7 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - 
frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:1.7 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.7 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.7 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:1.49 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:1.7 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:1.7 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:1.49 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:1.49 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:1.49 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:1.49 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:1.7 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:32.0 - frontier/cluster_36/score:1.49 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:1.7 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:1.7 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:1.7 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:1.49 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:1.7 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.7 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:1.7 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.7 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:1.343 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.7 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:1.2401 - 
frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:1.7 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:1.49 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:32.0 - frontier/cluster_58/score:1.49 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:1.7 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:1.7 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:5.0 - cluster/prob_snapshot/cluster_0:0.015880548383360175 - cluster/prob_snapshot/cluster_1:0.013918833583062742 - cluster/prob_snapshot/cluster_2:0.013918833583062742 - cluster/prob_snapshot/cluster_3:0.011584392970708797 - cluster/prob_snapshot/cluster_4:0.015880548383360175 - cluster/prob_snapshot/cluster_5:0.013918833583062742 - cluster/prob_snapshot/cluster_6:0.012545633222854539 - cluster/prob_snapshot/cluster_7:0.013918833583062742 - cluster/prob_snapshot/cluster_8:0.015880548383360175 - cluster/prob_snapshot/cluster_9:0.013918833583062742 - cluster/prob_snapshot/cluster_10:0.015880548383360175 - cluster/prob_snapshot/cluster_11:0.015880548383360175 - cluster/prob_snapshot/cluster_12:0.015880548383360175 - cluster/prob_snapshot/cluster_13:0.015880548383360175 - cluster/prob_snapshot/cluster_14:0.015880548383360175 - cluster/prob_snapshot/cluster_15:0.012545633222854539 - cluster/prob_snapshot/cluster_16:0.015880548383360175 - 
cluster/prob_snapshot/cluster_17:0.015880548383360175 - cluster/prob_snapshot/cluster_18:0.018682998098070797 - cluster/prob_snapshot/cluster_19:0.015880548383360175 - cluster/prob_snapshot/cluster_20:0.015880548383360175 - cluster/prob_snapshot/cluster_21:0.015880548383360175 - cluster/prob_snapshot/cluster_22:0.018682998098070797 - cluster/prob_snapshot/cluster_23:0.013918833583062742 - cluster/prob_snapshot/cluster_24:0.015880548383360175 - cluster/prob_snapshot/cluster_25:0.018682998098070797 - cluster/prob_snapshot/cluster_26:0.015880548383360175 - cluster/prob_snapshot/cluster_27:0.015880548383360175 - cluster/prob_snapshot/cluster_28:0.013918833583062742 - cluster/prob_snapshot/cluster_29:0.018682998098070797 - cluster/prob_snapshot/cluster_30:0.013918833583062742 - cluster/prob_snapshot/cluster_31:0.013918833583062742 - cluster/prob_snapshot/cluster_32:0.013918833583062742 - cluster/prob_snapshot/cluster_33:0.013918833583062742 - cluster/prob_snapshot/cluster_34:0.018682998098070797 - cluster/prob_snapshot/cluster_35:0.015880548383360175 - cluster/prob_snapshot/cluster_36:0.013918833583062742 - cluster/prob_snapshot/cluster_37:0.015880548383360175 - cluster/prob_snapshot/cluster_38:0.018682998098070797 - cluster/prob_snapshot/cluster_39:0.015880548383360175 - cluster/prob_snapshot/cluster_40:0.015880548383360175 - cluster/prob_snapshot/cluster_41:0.013918833583062742 - cluster/prob_snapshot/cluster_42:0.015880548383360175 - cluster/prob_snapshot/cluster_43:0.018682998098070797 - cluster/prob_snapshot/cluster_44:0.015880548383360175 - cluster/prob_snapshot/cluster_45:0.018682998098070797 - cluster/prob_snapshot/cluster_46:0.015880548383360175 - cluster/prob_snapshot/cluster_47:0.018682998098070797 - cluster/prob_snapshot/cluster_48:0.015880548383360175 - cluster/prob_snapshot/cluster_49:0.012545633222854539 - cluster/prob_snapshot/cluster_50:0.015880548383360175 - cluster/prob_snapshot/cluster_51:0.015880548383360175 - 
cluster/prob_snapshot/cluster_52:0.013918833583062742 - cluster/prob_snapshot/cluster_53:0.015880548383360175 - cluster/prob_snapshot/cluster_54:0.011584392970708797 - cluster/prob_snapshot/cluster_55:0.015880548383360175 - cluster/prob_snapshot/cluster_56:0.018682998098070797 - cluster/prob_snapshot/cluster_57:0.013918833583062742 - cluster/prob_snapshot/cluster_58:0.013918833583062742 - cluster/prob_snapshot/cluster_59:0.015880548383360175 - cluster/prob_snapshot/cluster_60:0.015880548383360175 - cluster/prob_snapshot/cluster_61:0.018682998098070797 - cluster/prob_snapshot/cluster_62:0.015880548383360175 - cluster/prob_snapshot/cluster_63:0.018682998098070797
[36m(TaskRunner pid=2823680)[0m Training Progress:   1%|          | 6/800 [08:13<18:01:24, 81.72s/it]
[36m(TaskRunner pid=2823680)[0m step:6 - global_seqlen/min:332571 - global_seqlen/max:372907 - global_seqlen/minmax_diff:40336 - global_seqlen/balanced_min:352490 - global_seqlen/balanced_max:352625 - global_seqlen/mean:352572.5 - frontier/skipped_zero_acc_count:79.0 - actor/entropy:np.float64(1.0206670662760735) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.000927519635297358 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.042771145002916455) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003258687540437677) - actor/ppo_kl:np.float64(4.977182280640591e-05) - actor/pg_clipfrac_lower:np.float64(5.124526069266721e-07) - actor/grad_norm:np.float64(0.307577052286693) - perf/mfu/actor:np.float64(0.33815363785103575) - perf/max_memory_allocated_gb:np.float64(52.65984106063843) - perf/max_memory_reserved_gb:np.float64(61.890625) - perf/cpu_memory_used_gb:np.float64(111.6574478149414) - actor/lr:np.float64(1e-06) - training/global_step:6 - training/epoch:0 - critic/score/mean:0.2551020383834839 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.2541694641113281 - critic/rewards/max:1.0026342868804932 - critic/rewards/min:-0.010209819301962852 - critic/advantages/mean:-0.0628541111946106 - critic/advantages/max:2.4748618602752686 - critic/advantages/min:-2.4748589992523193 - critic/returns/mean:-0.0628541111946106 - critic/returns/max:2.4748618602752686 - critic/returns/min:-2.4748589992523193 - response_length/mean:1102.1123046875 - response_length/max:8192.0 - response_length/min:19.0 - response_length/clip_ratio:0.0076530613005161285 - response_length_non_aborted/mean:1102.1123046875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:19.0 - response_length_non_aborted/clip_ratio:0.0076530613005161285 - response/aborted_ratio:0.0 - prompt_length/mean:243.85714721679688 - prompt_length/max:410.0 - prompt_length/min:188.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.366629481315613e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.32305164355784655) - timing_s/agent_loop/generate_sequences/max:np.float64(27.731652394868433) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.859505607324536) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.731652394868433) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:227 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.05668289028108 - timing_s/reward:0.0001714620739221573 - timing_s/old_log_prob:6.712658885866404 - timing_s/ref:11.707427806220949 - timing_s/adv:0.04310409724712372 - timing_s/update_actor:12.215351668186486 - timing_s/update_weights:20.611194293946028 - timing_s/step:80.68634286429733 - timing_s/stop_profile:6.523262709379196e-05 - timing_per_token_ms/adv:8.169534370782707e-05 - timing_per_token_ms/update_actor:0.023151798014075443 - timing_per_token_ms/gen:0.06725648080745017 - timing_per_token_ms/ref:0.02218912817220907 - perf/total_num_tokens:1410290 - perf/time_per_step:80.68634286429733 - perf/throughput:4369.667622598481 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:479.0 - frontier/mean_score:1.6181609375 - frontier/mean_frontier_pct:0.16015990697515237 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:0.0 - frontier/batch_hard_count:16.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:1.7 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:1.49 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:1.2401 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:1.49 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:1.49 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:1.49 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:1.7 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:1.7 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:1.7 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:1.7 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:1.2401 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:1.49 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:1.7 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - 
frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:1.49 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.7 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:1.49 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:1.49 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:1.7 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:1.49 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:1.49 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:1.49 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:1.49 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:1.49 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:1.7 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:1.49 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:32.0 - frontier/cluster_36/score:1.49 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:1.7 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:1.7 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:1.7 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:1.7 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:1.49 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.7 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:1.7 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:1.7 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:1.49 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:1.343 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.7 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:1.2401 - 
frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:1.49 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:1.343 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:1.343 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:1.7 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:1.7 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:1.7 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:6.0 - cluster/prob_snapshot/cluster_0:0.016415239908731265 - cluster/prob_snapshot/cluster_1:0.01438747497882917 - cluster/prob_snapshot/cluster_2:0.01438747497882917 - cluster/prob_snapshot/cluster_3:0.011974434712245673 - cluster/prob_snapshot/cluster_4:0.016415239908731265 - cluster/prob_snapshot/cluster_5:0.01438747497882917 - cluster/prob_snapshot/cluster_6:0.0129680395278977 - cluster/prob_snapshot/cluster_7:0.01438747497882917 - cluster/prob_snapshot/cluster_8:0.01438747497882917 - cluster/prob_snapshot/cluster_9:0.01438747497882917 - cluster/prob_snapshot/cluster_10:0.016415239908731265 - cluster/prob_snapshot/cluster_11:0.016415239908731265 - cluster/prob_snapshot/cluster_12:0.016415239908731265 - cluster/prob_snapshot/cluster_13:0.016415239908731265 - cluster/prob_snapshot/cluster_14:0.016415239908731265 - cluster/prob_snapshot/cluster_15:0.011974434712245673 - cluster/prob_snapshot/cluster_16:0.01438747497882917 - 
cluster/prob_snapshot/cluster_17:0.016415239908731265 - cluster/prob_snapshot/cluster_18:0.01931204695144855 - cluster/prob_snapshot/cluster_19:0.01438747497882917 - cluster/prob_snapshot/cluster_20:0.016415239908731265 - cluster/prob_snapshot/cluster_21:0.01438747497882917 - cluster/prob_snapshot/cluster_22:0.01931204695144855 - cluster/prob_snapshot/cluster_23:0.01438747497882917 - cluster/prob_snapshot/cluster_24:0.016415239908731265 - cluster/prob_snapshot/cluster_25:0.01931204695144855 - cluster/prob_snapshot/cluster_26:0.016415239908731265 - cluster/prob_snapshot/cluster_27:0.01438747497882917 - cluster/prob_snapshot/cluster_28:0.01438747497882917 - cluster/prob_snapshot/cluster_29:0.01931204695144855 - cluster/prob_snapshot/cluster_30:0.01438747497882917 - cluster/prob_snapshot/cluster_31:0.01438747497882917 - cluster/prob_snapshot/cluster_32:0.01438747497882917 - cluster/prob_snapshot/cluster_33:0.01438747497882917 - cluster/prob_snapshot/cluster_34:0.016415239908731265 - cluster/prob_snapshot/cluster_35:0.01438747497882917 - cluster/prob_snapshot/cluster_36:0.01438747497882917 - cluster/prob_snapshot/cluster_37:0.016415239908731265 - cluster/prob_snapshot/cluster_38:0.016415239908731265 - cluster/prob_snapshot/cluster_39:0.016415239908731265 - cluster/prob_snapshot/cluster_40:0.016415239908731265 - cluster/prob_snapshot/cluster_41:0.01438747497882917 - cluster/prob_snapshot/cluster_42:0.01438747497882917 - cluster/prob_snapshot/cluster_43:0.01931204695144855 - cluster/prob_snapshot/cluster_44:0.016415239908731265 - cluster/prob_snapshot/cluster_45:0.01931204695144855 - cluster/prob_snapshot/cluster_46:0.016415239908731265 - cluster/prob_snapshot/cluster_47:0.016415239908731265 - cluster/prob_snapshot/cluster_48:0.01438747497882917 - cluster/prob_snapshot/cluster_49:0.0129680395278977 - cluster/prob_snapshot/cluster_50:0.016415239908731265 - cluster/prob_snapshot/cluster_51:0.016415239908731265 - cluster/prob_snapshot/cluster_52:0.01438747497882917 - 
cluster/prob_snapshot/cluster_53:0.016415239908731265 - cluster/prob_snapshot/cluster_54:0.011974434712245673 - cluster/prob_snapshot/cluster_55:0.01438747497882917 - cluster/prob_snapshot/cluster_56:0.01931204695144855 - cluster/prob_snapshot/cluster_57:0.0129680395278977 - cluster/prob_snapshot/cluster_58:0.0129680395278977 - cluster/prob_snapshot/cluster_59:0.016415239908731265 - cluster/prob_snapshot/cluster_60:0.016415239908731265 - cluster/prob_snapshot/cluster_61:0.016415239908731265 - cluster/prob_snapshot/cluster_62:0.016415239908731265 - cluster/prob_snapshot/cluster_63:0.01931204695144855
(RewardLoopWorker pid=2826760) WARNING:2026-04-12 11:40:38,873:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:   1%|          | 7/800 [09:35<18:02:05, 81.87s/it]
[36m(TaskRunner pid=2823680)[0m step:7 - global_seqlen/min:329435 - global_seqlen/max:386652 - global_seqlen/minmax_diff:57217 - global_seqlen/balanced_min:365204 - global_seqlen/balanced_max:365274 - global_seqlen/mean:365252.75 - frontier/skipped_zero_acc_count:70.0 - actor/entropy:np.float64(0.772479358675151) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0014359778724610806 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.05555899761384353) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00026604108191757657) - actor/ppo_kl:np.float64(6.302817123253527e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2676794044673443) - perf/mfu/actor:np.float64(0.2547568734304532) - perf/max_memory_allocated_gb:np.float64(52.65984106063843) - perf/max_memory_reserved_gb:np.float64(61.890625) - perf/cpu_memory_used_gb:np.float64(111.92150115966797) - actor/lr:np.float64(1e-06) - training/global_step:7 - training/epoch:0 - critic/score/mean:0.2823275923728943 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.28062281012535095 - critic/rewards/max:1.0018641948699951 - critic/rewards/min:-0.00892594549804926 - critic/advantages/mean:-0.05316413566470146 - critic/advantages/max:2.4748587608337402 - critic/advantages/min:-2.4748647212982178 - critic/returns/mean:-0.05316413566470146 - critic/returns/max:2.4748587608337402 - critic/returns/min:-2.4748647212982178 - response_length/mean:1123.1142578125 - response_length/max:8192.0 - response_length/min:4.0 - response_length/clip_ratio:0.004310344811528921 - response_length_non_aborted/mean:1123.1142578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:4.0 - response_length_non_aborted/clip_ratio:0.004310344811528921 - response/aborted_ratio:0.0 - prompt_length/mean:239.63792419433594 - prompt_length/max:393.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.94772058725357e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.16555730067193508) - timing_s/agent_loop/generate_sequences/max:np.float64(28.056373425759375) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.235326389111833) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.056373425759375) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:249 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.967537217773497 - timing_s/reward:0.00010658055543899536 - timing_s/old_log_prob:6.655588950961828 - timing_s/ref:8.652436582371593 - timing_s/adv:0.04349329601973295 - timing_s/update_actor:17.274432277306914 - timing_s/update_weights:19.063630846329033 - timing_s/step:81.97874429635704 - timing_s/stop_profile:5.950871855020523e-05 - timing_per_token_ms/adv:6.878400552212411e-05 - timing_per_token_ms/update_actor:0.02731925960761282 - timing_per_token_ms/gen:0.057505468395823454 - timing_per_token_ms/ref:0.013683700710832688 - perf/total_num_tokens:1461011 - perf/time_per_step:81.97874429635704 - perf/throughput:4455.456754492287 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:549.0 - frontier/mean_score:1.5924952380952384 - frontier/mean_frontier_pct:0.1742793015442692 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:2.0 - frontier/batch_hard_count:14.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:2.09 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.343 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:1.2401 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:1.49 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:1.49 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:1.49 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:1.7 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:1.7 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:1.7 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:1.7 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:1.49 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:1.49 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - 
frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:1.49 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.7 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:1.49 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:1.49 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:1.7 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:1.49 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:1.49 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:1.49 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:1.49 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:1.49 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:1.343 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:1.7 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:1.9429999999999998 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - 
frontier/cluster_36/score:1.343 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:1.7 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:1.49 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:1.7 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:1.49 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:1.7 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.7 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.49 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:1.7 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:1.343 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:1.343 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.7 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - 
frontier/cluster_54/score:1.2401 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:1.49 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:1.7 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:1.343 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:1.343 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:1.7 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:1.7 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:1.7 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.7 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:7.0 - cluster/prob_snapshot/cluster_0:0.020831838225326724 - cluster/prob_snapshot/cluster_1:0.013386200352446791 - cluster/prob_snapshot/cluster_2:0.01485140619891714 - cluster/prob_snapshot/cluster_3:0.012360556259917548 - cluster/prob_snapshot/cluster_4:0.016944557408160494 - cluster/prob_snapshot/cluster_5:0.01485140619891714 - cluster/prob_snapshot/cluster_6:0.013386200352446791 - cluster/prob_snapshot/cluster_7:0.01485140619891714 - cluster/prob_snapshot/cluster_8:0.01485140619891714 - cluster/prob_snapshot/cluster_9:0.01485140619891714 - cluster/prob_snapshot/cluster_10:0.016944557408160494 - cluster/prob_snapshot/cluster_11:0.016944557408160494 - cluster/prob_snapshot/cluster_12:0.016944557408160494 - cluster/prob_snapshot/cluster_13:0.016944557408160494 - cluster/prob_snapshot/cluster_14:0.016944557408160494 - cluster/prob_snapshot/cluster_15:0.0 - 
cluster/prob_snapshot/cluster_16:0.01485140619891714 - cluster/prob_snapshot/cluster_17:0.01485140619891714 - cluster/prob_snapshot/cluster_18:0.01993477342136529 - cluster/prob_snapshot/cluster_19:0.01485140619891714 - cluster/prob_snapshot/cluster_20:0.016944557408160494 - cluster/prob_snapshot/cluster_21:0.01485140619891714 - cluster/prob_snapshot/cluster_22:0.01993477342136529 - cluster/prob_snapshot/cluster_23:0.01485140619891714 - cluster/prob_snapshot/cluster_24:0.016944557408160494 - cluster/prob_snapshot/cluster_25:0.016944557408160494 - cluster/prob_snapshot/cluster_26:0.01485140619891714 - cluster/prob_snapshot/cluster_27:0.01485140619891714 - cluster/prob_snapshot/cluster_28:0.01485140619891714 - cluster/prob_snapshot/cluster_29:0.01993477342136529 - cluster/prob_snapshot/cluster_30:0.01485140619891714 - cluster/prob_snapshot/cluster_31:0.01485140619891714 - cluster/prob_snapshot/cluster_32:0.01485140619891714 - cluster/prob_snapshot/cluster_33:0.013386200352446791 - cluster/prob_snapshot/cluster_34:0.016944557408160494 - cluster/prob_snapshot/cluster_35:0.019366632378856375 - cluster/prob_snapshot/cluster_36:0.013386200352446791 - cluster/prob_snapshot/cluster_37:0.016944557408160494 - cluster/prob_snapshot/cluster_38:0.01485140619891714 - cluster/prob_snapshot/cluster_39:0.01485140619891714 - cluster/prob_snapshot/cluster_40:0.016944557408160494 - cluster/prob_snapshot/cluster_41:0.01485140619891714 - cluster/prob_snapshot/cluster_42:0.01485140619891714 - cluster/prob_snapshot/cluster_43:0.016944557408160494 - cluster/prob_snapshot/cluster_44:0.016944557408160494 - cluster/prob_snapshot/cluster_45:0.01993477342136529 - cluster/prob_snapshot/cluster_46:0.01485140619891714 - cluster/prob_snapshot/cluster_47:0.016944557408160494 - cluster/prob_snapshot/cluster_48:0.013386200352446791 - cluster/prob_snapshot/cluster_49:0.013386200352446791 - cluster/prob_snapshot/cluster_50:0.016944557408160494 - cluster/prob_snapshot/cluster_51:0.016944557408160494 - 
cluster/prob_snapshot/cluster_52:0.01485140619891714 - cluster/prob_snapshot/cluster_53:0.016944557408160494 - cluster/prob_snapshot/cluster_54:0.012360556259917548 - cluster/prob_snapshot/cluster_55:0.01485140619891714 - cluster/prob_snapshot/cluster_56:0.016944557408160494 - cluster/prob_snapshot/cluster_57:0.013386200352446791 - cluster/prob_snapshot/cluster_58:0.013386200352446791 - cluster/prob_snapshot/cluster_59:0.016944557408160494 - cluster/prob_snapshot/cluster_60:0.016944557408160494 - cluster/prob_snapshot/cluster_61:0.016944557408160494 - cluster/prob_snapshot/cluster_62:0.016944557408160494 - cluster/prob_snapshot/cluster_63:0.016944557408160494
[36m(TaskRunner pid=2823680)[0m Training Progress:   1%|          | 8/800 [11:00<18:12:16, 82.75s/it]
[36m(TaskRunner pid=2823680)[0m step:8 - global_seqlen/min:326575 - global_seqlen/max:381927 - global_seqlen/minmax_diff:55352 - global_seqlen/balanced_min:360361 - global_seqlen/balanced_max:360558 - global_seqlen/mean:360460.5 - frontier/skipped_zero_acc_count:60.0 - actor/entropy:np.float64(0.7224291387726279) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.003080888418480754 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.01925905328243971) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0008205807892355712) - actor/ppo_kl:np.float64(7.180917203037145e-05) - actor/pg_clipfrac_lower:np.float64(1.0901321194764665e-06) - actor/grad_norm:np.float64(0.2792748345269097) - perf/mfu/actor:np.float64(0.21098669604423254) - perf/max_memory_allocated_gb:np.float64(52.65984106063843) - perf/max_memory_reserved_gb:np.float64(61.890625) - perf/cpu_memory_used_gb:np.float64(111.7881908416748) - actor/lr:np.float64(1e-06) - training/global_step:8 - training/epoch:0 - critic/score/mean:0.3125 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.31010162830352783 - critic/rewards/max:1.0028668642044067 - critic/rewards/min:-0.015371055342257023 - critic/advantages/mean:-0.07357174158096313 - critic/advantages/max:2.4748620986938477 - critic/advantages/min:-2.4748542308807373 - critic/returns/mean:-0.07357174158096313 - critic/returns/max:2.4748620986938477 - critic/returns/min:-2.4748542308807373 - response_length/mean:1074.18017578125 - response_length/max:8192.0 - response_length/min:9.0 - response_length/clip_ratio:0.005514706019312143 - response_length_non_aborted/mean:1074.18017578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:9.0 - response_length_non_aborted/clip_ratio:0.005514706019312143 - response/aborted_ratio:0.0 - prompt_length/mean:241.66175842285156 - prompt_length/max:527.0 - prompt_length/min:186.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.035637438297272e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.28194108977913857) - timing_s/agent_loop/generate_sequences/max:np.float64(27.096440544351935) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.141658643802657) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.096440544351935) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:220 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:28.615013715811074 - timing_s/reward:0.00020968634635210037 - timing_s/old_log_prob:7.709917593747377 - timing_s/ref:7.961838176473975 - timing_s/adv:0.07972087617963552 - timing_s/update_actor:19.929085806943476 - timing_s/update_weights:19.557602174580097 - timing_s/step:84.40544856060296 - timing_s/stop_profile:6.73411414027214e-05 - timing_per_token_ms/adv:0.00011137031505164096 - timing_per_token_ms/update_actor:0.027840995625904176 - timing_per_token_ms/gen:0.04896862811893317 - timing_per_token_ms/ref:0.011122713003129253 - perf/total_num_tokens:1441842 - perf/time_per_step:84.40544856060296 - perf/throughput:4270.58331123245 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:609.0 - frontier/mean_score:1.5653238095238096 - frontier/mean_frontier_pct:0.20346346903783666 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:2.0 - frontier/batch_hard_count:14.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:2.09 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:1.2401 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:1.2401 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:1.49 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:1.9429999999999998 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:1.343 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.09 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:1.7 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:1.49 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:1.7 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:1.49 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:1.343 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - 
frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:1.49 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.7 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:48.0 - frontier/cluster_21/score:1.343 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:1.49 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:1.7 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:1.49 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:1.343 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:1.343 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:1.49 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.343 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:1.343 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:1.7 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:1.6601 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - 
frontier/cluster_36/score:1.343 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:1.49 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:1.343 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:1.7 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:1.49 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:1.7 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.7 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.49 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:1.7 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:1.343 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:1.343 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.7 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - 
frontier/cluster_54/score:1.2401 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:1.49 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:1.7 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:1.343 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:1.343 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:32.0 - frontier/cluster_59/score:1.49 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:1.7 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.7 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:8.0 - cluster/prob_snapshot/cluster_0:0.021193444431599928 - cluster/prob_snapshot/cluster_1:0.012575115042883767 - cluster/prob_snapshot/cluster_2:0.015109202011044928 - cluster/prob_snapshot/cluster_3:0.012575115042883767 - cluster/prob_snapshot/cluster_4:0.017238686858239177 - cluster/prob_snapshot/cluster_5:0.015109202011044928 - cluster/prob_snapshot/cluster_6:0.01361856261800895 - cluster/prob_snapshot/cluster_7:0.019702805038563954 - cluster/prob_snapshot/cluster_8:0.01361856261800895 - cluster/prob_snapshot/cluster_9:0.015109202011044928 - cluster/prob_snapshot/cluster_10:0.021193444431599928 - cluster/prob_snapshot/cluster_11:0.017238686858239177 - cluster/prob_snapshot/cluster_12:0.017238686858239177 - cluster/prob_snapshot/cluster_13:0.015109202011044928 - cluster/prob_snapshot/cluster_14:0.017238686858239177 - cluster/prob_snapshot/cluster_15:0.0 - 
cluster/prob_snapshot/cluster_16:0.015109202011044928 - cluster/prob_snapshot/cluster_17:0.01361856261800895 - cluster/prob_snapshot/cluster_18:0.02028080806851668 - cluster/prob_snapshot/cluster_19:0.015109202011044928 - cluster/prob_snapshot/cluster_20:0.017238686858239177 - cluster/prob_snapshot/cluster_21:0.01361856261800895 - cluster/prob_snapshot/cluster_22:0.02028080806851668 - cluster/prob_snapshot/cluster_23:0.015109202011044928 - cluster/prob_snapshot/cluster_24:0.017238686858239177 - cluster/prob_snapshot/cluster_25:0.017238686858239177 - cluster/prob_snapshot/cluster_26:0.015109202011044928 - cluster/prob_snapshot/cluster_27:0.01361856261800895 - cluster/prob_snapshot/cluster_28:0.01361856261800895 - cluster/prob_snapshot/cluster_29:0.017238686858239177 - cluster/prob_snapshot/cluster_30:0.015109202011044928 - cluster/prob_snapshot/cluster_31:0.01361856261800895 - cluster/prob_snapshot/cluster_32:0.015109202011044928 - cluster/prob_snapshot/cluster_33:0.01361856261800895 - cluster/prob_snapshot/cluster_34:0.017238686858239177 - cluster/prob_snapshot/cluster_35:0.01683408473727227 - cluster/prob_snapshot/cluster_36:0.01361856261800895 - cluster/prob_snapshot/cluster_37:0.015109202011044928 - cluster/prob_snapshot/cluster_38:0.015109202011044928 - cluster/prob_snapshot/cluster_39:0.01361856261800895 - cluster/prob_snapshot/cluster_40:0.017238686858239177 - cluster/prob_snapshot/cluster_41:0.015109202011044928 - cluster/prob_snapshot/cluster_42:0.015109202011044928 - cluster/prob_snapshot/cluster_43:0.017238686858239177 - cluster/prob_snapshot/cluster_44:0.017238686858239177 - cluster/prob_snapshot/cluster_45:0.02028080806851668 - cluster/prob_snapshot/cluster_46:0.015109202011044928 - cluster/prob_snapshot/cluster_47:0.017238686858239177 - cluster/prob_snapshot/cluster_48:0.01361856261800895 - cluster/prob_snapshot/cluster_49:0.01361856261800895 - cluster/prob_snapshot/cluster_50:0.017238686858239177 - cluster/prob_snapshot/cluster_51:0.017238686858239177 
- cluster/prob_snapshot/cluster_52:0.015109202011044928 - cluster/prob_snapshot/cluster_53:0.017238686858239177 - cluster/prob_snapshot/cluster_54:0.012575115042883767 - cluster/prob_snapshot/cluster_55:0.015109202011044928 - cluster/prob_snapshot/cluster_56:0.017238686858239177 - cluster/prob_snapshot/cluster_57:0.01361856261800895 - cluster/prob_snapshot/cluster_58:0.01361856261800895 - cluster/prob_snapshot/cluster_59:0.015109202011044928 - cluster/prob_snapshot/cluster_60:0.017238686858239177 - cluster/prob_snapshot/cluster_61:0.015109202011044928 - cluster/prob_snapshot/cluster_62:0.017238686858239177 - cluster/prob_snapshot/cluster_63:0.017238686858239177
[36m(TaskRunner pid=2823680)[0m Training Progress:   1%|          | 9/800 [12:33<18:54:10, 86.03s/it]
[36m(TaskRunner pid=2823680)[0m step:9 - global_seqlen/min:329092 - global_seqlen/max:396547 - global_seqlen/minmax_diff:67455 - global_seqlen/balanced_min:366654 - global_seqlen/balanced_max:366798 - global_seqlen/mean:366717.5 - frontier/skipped_zero_acc_count:54.0 - actor/entropy:np.float64(0.5474772189517279) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0049620019271969795 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.011252056236116914) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0007594991099255354) - actor/ppo_kl:np.float64(8.857022996958344e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2752579778432846) - perf/mfu/actor:np.float64(0.27513770424965267) - perf/max_memory_allocated_gb:np.float64(55.85618591308594) - perf/max_memory_reserved_gb:np.float64(61.890625) - perf/cpu_memory_used_gb:np.float64(112.1150016784668) - actor/lr:np.float64(1e-06) - training/global_step:9 - training/epoch:0 - critic/score/mean:0.40371620655059814 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.40002554655075073 - critic/rewards/max:1.0025163888931274 - critic/rewards/min:-0.02527988702058792 - critic/advantages/mean:-0.1035698875784874 - critic/advantages/max:2.474853754043579 - critic/advantages/min:-2.4748544692993164 - critic/returns/mean:-0.1035698875784874 - critic/returns/max:2.474853754043579 - critic/returns/min:-2.4748544692993164 - response_length/mean:1106.3243408203125 - response_length/max:8192.0 - response_length/min:103.0 - response_length/clip_ratio:0.005067567341029644 - response_length_non_aborted/mean:1106.3243408203125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:103.0 - response_length_non_aborted/clip_ratio:0.005067567341029644 - response/aborted_ratio:0.0 - prompt_length/mean:228.6081085205078 - prompt_length/max:363.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.756760507822037e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.2203143574297428) - timing_s/agent_loop/generate_sequences/max:np.float64(26.402164839208126) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.296501001736033) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(26.402164839208126) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:245 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:27.86219771951437 - timing_s/reward:0.00011914409697055817 - timing_s/old_log_prob:8.211449528113008 - timing_s/ref:16.514681762084365 - timing_s/adv:0.05583573877811432 - timing_s/update_actor:16.134237051941454 - timing_s/update_weights:23.15213152207434 - timing_s/step:92.69282905943692 - timing_s/stop_profile:4.9404799938201904e-05 - timing_per_token_ms/adv:7.065310874388105e-05 - timing_per_token_ms/update_actor:0.02041584887880429 - timing_per_token_ms/gen:0.0425413435645099 - timing_per_token_ms/ref:0.020897253836721624 - perf/total_num_tokens:1466870 - perf/time_per_step:92.69282905943692 - perf/throughput:3956.2661289025036 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:663.0 - frontier/mean_score:1.5628380952380951 - frontier/mean_frontier_pct:0.22201255194759542 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:5.0 - frontier/batch_hard_count:11.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:1.763 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:1.2401 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:1.2401 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:1.49 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:1.6601 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:1.8400999999999998 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.343 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:2.3629999999999995 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.49 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:1.7 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:1.49 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:1.7 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:1.49 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:1.343 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:1.49 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.7 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.2401 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:1.49 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.09 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.343 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:1.343 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:1.343 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:1.49 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.343 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:1.343 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:1.7 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:1.6601 - frontier/cluster_35/cluster_size:184.0 - 
frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:1.343 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:1.49 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:1.343 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.09 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:1.49 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:1.7 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.7 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.49 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:1.7 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:1.343 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:1.8400999999999998 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:1.49 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:32.0 - frontier/cluster_53/score:1.49 - frontier/cluster_53/cluster_size:300.0 - 
frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:1.2401 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:1.49 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:1.7 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:1.343 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:1.343 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:1.343 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:1.49 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:32.0 - frontier/cluster_63/score:1.49 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:9.0 - cluster/prob_snapshot/cluster_0:0.017905966759700503 - cluster/prob_snapshot/cluster_1:0.012595115926661711 - cluster/prob_snapshot/cluster_2:0.015133233393053745 - cluster/prob_snapshot/cluster_3:0.012595115926661711 - cluster/prob_snapshot/cluster_4:0.017266105213551253 - cluster/prob_snapshot/cluster_5:0.015133233393053745 - cluster/prob_snapshot/cluster_6:0.01364022311870549 - cluster/prob_snapshot/cluster_7:0.016860859567656726 - cluster/prob_snapshot/cluster_8:0.018689035413797444 - cluster/prob_snapshot/cluster_9:0.01364022311870549 - cluster/prob_snapshot/cluster_10:0.023999886246836237 - cluster/prob_snapshot/cluster_11:0.015133233393053745 - cluster/prob_snapshot/cluster_12:0.017266105213551253 - cluster/prob_snapshot/cluster_13:0.015133233393053745 - cluster/prob_snapshot/cluster_14:0.017266105213551253 - cluster/prob_snapshot/cluster_15:0.0 - 
cluster/prob_snapshot/cluster_16:0.015133233393053745 - cluster/prob_snapshot/cluster_17:0.01364022311870549 - cluster/prob_snapshot/cluster_18:0.020313064957119122 - cluster/prob_snapshot/cluster_19:0.015133233393053745 - cluster/prob_snapshot/cluster_20:0.017266105213551253 - cluster/prob_snapshot/cluster_21:0.012595115926661711 - cluster/prob_snapshot/cluster_22:0.020313064957119122 - cluster/prob_snapshot/cluster_23:0.015133233393053745 - cluster/prob_snapshot/cluster_24:0.02122715288018948 - cluster/prob_snapshot/cluster_25:0.017266105213551253 - cluster/prob_snapshot/cluster_26:0.01364022311870549 - cluster/prob_snapshot/cluster_27:0.01364022311870549 - cluster/prob_snapshot/cluster_28:0.01364022311870549 - cluster/prob_snapshot/cluster_29:0.017266105213551253 - cluster/prob_snapshot/cluster_30:0.015133233393053745 - cluster/prob_snapshot/cluster_31:0.01364022311870549 - cluster/prob_snapshot/cluster_32:0.015133233393053745 - cluster/prob_snapshot/cluster_33:0.01364022311870549 - cluster/prob_snapshot/cluster_34:0.017266105213551253 - cluster/prob_snapshot/cluster_35:0.016860859567656726 - cluster/prob_snapshot/cluster_36:0.01364022311870549 - cluster/prob_snapshot/cluster_37:0.015133233393053745 - cluster/prob_snapshot/cluster_38:0.015133233393053745 - cluster/prob_snapshot/cluster_39:0.01364022311870549 - cluster/prob_snapshot/cluster_40:0.02122715288018948 - cluster/prob_snapshot/cluster_41:0.015133233393053745 - cluster/prob_snapshot/cluster_42:0.015133233393053745 - cluster/prob_snapshot/cluster_43:0.017266105213551253 - cluster/prob_snapshot/cluster_44:0.017266105213551253 - cluster/prob_snapshot/cluster_45:0.020313064957119122 - cluster/prob_snapshot/cluster_46:0.015133233393053745 - cluster/prob_snapshot/cluster_47:0.017266105213551253 - cluster/prob_snapshot/cluster_48:0.01364022311870549 - cluster/prob_snapshot/cluster_49:0.018689035413797444 - cluster/prob_snapshot/cluster_50:0.017266105213551253 - 
cluster/prob_snapshot/cluster_51:0.015133233393053745 - cluster/prob_snapshot/cluster_52:0.015133233393053745 - cluster/prob_snapshot/cluster_53:0.015133233393053745 - cluster/prob_snapshot/cluster_54:0.012595115926661711 - cluster/prob_snapshot/cluster_55:0.015133233393053745 - cluster/prob_snapshot/cluster_56:0.017266105213551253 - cluster/prob_snapshot/cluster_57:0.01364022311870549 - cluster/prob_snapshot/cluster_58:0.01364022311870549 - cluster/prob_snapshot/cluster_59:0.01364022311870549 - cluster/prob_snapshot/cluster_60:0.017266105213551253 - cluster/prob_snapshot/cluster_61:0.015133233393053745 - cluster/prob_snapshot/cluster_62:0.015133233393053745 - cluster/prob_snapshot/cluster_63:0.015133233393053745
(RewardLoopWorker pid=2826758) WARNING:2026-04-12 11:44:56,540:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:   1%|▏         | 10/800 [14:10<19:36:13, 89.33s/it]
[36m(TaskRunner pid=2823680)[0m step:10 - global_seqlen/min:338183 - global_seqlen/max:406202 - global_seqlen/minmax_diff:68019 - global_seqlen/balanced_min:372843 - global_seqlen/balanced_max:373015 - global_seqlen/mean:372940.0 - frontier/skipped_zero_acc_count:50.0 - actor/entropy:np.float64(0.5250899782165502) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0053262654691934586 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.015651320398319513) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002016325274430878) - actor/ppo_kl:np.float64(8.094202816886164e-05) - actor/pg_clipfrac_lower:np.float64(5.28201734339699e-07) - actor/grad_norm:np.float64(0.2470053568482399) - perf/mfu/actor:np.float64(0.27786630630536613) - perf/max_memory_allocated_gb:np.float64(61.204898834228516) - perf/max_memory_reserved_gb:np.float64(67.14453125) - perf/cpu_memory_used_gb:np.float64(112.42438125610352) - actor/lr:np.float64(1e-06) - training/global_step:10 - training/epoch:0 - critic/score/mean:0.3830128312110901 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.37860557436943054 - critic/rewards/max:1.0037246942520142 - critic/rewards/min:-0.026670554652810097 - critic/advantages/mean:-0.10783988237380981 - critic/advantages/max:2.474862813949585 - critic/advantages/min:-2.474835157394409 - critic/returns/mean:-0.10783988237380981 - critic/returns/max:2.474862813949585 - critic/returns/min:-2.474835157394409 - response_length/mean:1127.22119140625 - response_length/max:8192.0 - response_length/min:112.0 - response_length/clip_ratio:0.0032051282469183207 - response_length_non_aborted/mean:1127.22119140625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:112.0 - response_length_non_aborted/clip_ratio:0.0032051282469183207 - response/aborted_ratio:0.0 - prompt_length/mean:231.14102172851562 - prompt_length/max:544.0 - prompt_length/min:181.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.788984268903732e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.6619877871125937) - timing_s/agent_loop/generate_sequences/max:np.float64(27.64372748415917) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.992762191921429) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.64372748415917) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:234 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.580199494957924 - timing_s/reward:0.00021512527018785477 - timing_s/old_log_prob:8.689249594695866 - timing_s/ref:17.348312680609524 - timing_s/adv:0.07049304433166981 - timing_s/update_actor:15.815828866325319 - timing_s/update_weights:24.616247878409922 - timing_s/step:96.51680794823915 - timing_s/stop_profile:5.667470395565033e-05 - timing_per_token_ms/adv:8.31660539673176e-05 - timing_per_token_ms/update_actor:0.018659147005284597 - timing_per_token_ms/gen:0.04205400661224125 - timing_per_token_ms/ref:0.02046713576234757 - perf/total_num_tokens:1491760 - perf/time_per_step:96.51680794823915 - perf/throughput:3863.990199510156 - frontier/active_count:62.0 - frontier/completed_count:2.0 - frontier/blacklisted_count:713.0 - frontier/mean_score:1.5564872580645162 - frontier/mean_frontier_pct:0.23319395729655742 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:3.0 - frontier/batch_hard_count:13.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:1.763 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:1.2401 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:1.7680699999999998 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:1.49 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:1.46207 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:1.5880699999999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.343 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:2.5540999999999996 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:1.343 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:1.49 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:1.49 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:1.7 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:1.49 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:1.49 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:1.49 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.2401 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:1.343 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.09 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:1.2401 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:1.343 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:1.343 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:1.49 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.343 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:1.343 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:1.7 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:1.6601 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:1.343 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:1.49 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:1.2401 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.09 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:1.343 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:1.7 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.7 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.49 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:1.7 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:1.343 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:1.8400999999999998 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:1.49 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:32.0 - frontier/cluster_53/score:1.49 - 
frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:1.49 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:1.49 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:1.343 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:1.343 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:1.8400999999999998 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:1.49 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:32.0 - frontier/cluster_63/score:1.49 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:10.0 - cluster/prob_snapshot/cluster_0:0.018269011663048958 - cluster/prob_snapshot/cluster_1:0.012850482906039146 - cluster/prob_snapshot/cluster_2:0.01544006090637717 - cluster/prob_snapshot/cluster_3:0.018321549319958576 - cluster/prob_snapshot/cluster_4:0.017616176873047777 - cluster/prob_snapshot/cluster_5:0.01544006090637717 - cluster/prob_snapshot/cluster_6:0.012850482906039146 - cluster/prob_snapshot/cluster_7:0.015150637482809978 - cluster/prob_snapshot/cluster_8:0.016456307062812342 - cluster/prob_snapshot/cluster_9:0.013916779729707743 - cluster/prob_snapshot/cluster_10:0.026466751383206658 - cluster/prob_snapshot/cluster_11:0.013916779729707743 - cluster/prob_snapshot/cluster_12:0.01544006090637717 - cluster/prob_snapshot/cluster_13:0.01544006090637717 - 
cluster/prob_snapshot/cluster_14:0.017616176873047777 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.01544006090637717 - cluster/prob_snapshot/cluster_17:0.012850482906039146 - cluster/prob_snapshot/cluster_18:0.020724913968291504 - cluster/prob_snapshot/cluster_19:0.01544006090637717 - cluster/prob_snapshot/cluster_20:0.01544006090637717 - cluster/prob_snapshot/cluster_21:0.012850482906039146 - cluster/prob_snapshot/cluster_22:0.020724913968291504 - cluster/prob_snapshot/cluster_23:0.013916779729707743 - cluster/prob_snapshot/cluster_24:0.02165753509686462 - cluster/prob_snapshot/cluster_25:0.017616176873047777 - cluster/prob_snapshot/cluster_26:0.012850482906039146 - cluster/prob_snapshot/cluster_27:0.013916779729707743 - cluster/prob_snapshot/cluster_28:0.013916779729707743 - cluster/prob_snapshot/cluster_29:0.017616176873047777 - cluster/prob_snapshot/cluster_30:0.01544006090637717 - cluster/prob_snapshot/cluster_31:0.013916779729707743 - cluster/prob_snapshot/cluster_32:0.01544006090637717 - cluster/prob_snapshot/cluster_33:0.013916779729707743 - cluster/prob_snapshot/cluster_34:0.017616176873047777 - cluster/prob_snapshot/cluster_35:0.01720271483938036 - cluster/prob_snapshot/cluster_36:0.013916779729707743 - cluster/prob_snapshot/cluster_37:0.01544006090637717 - cluster/prob_snapshot/cluster_38:0.01544006090637717 - cluster/prob_snapshot/cluster_39:0.012850482906039146 - cluster/prob_snapshot/cluster_40:0.02165753509686462 - cluster/prob_snapshot/cluster_41:0.013916779729707743 - cluster/prob_snapshot/cluster_42:0.01544006090637717 - cluster/prob_snapshot/cluster_43:0.017616176873047777 - cluster/prob_snapshot/cluster_44:0.017616176873047777 - cluster/prob_snapshot/cluster_45:0.020724913968291504 - cluster/prob_snapshot/cluster_46:0.01544006090637717 - cluster/prob_snapshot/cluster_47:0.017616176873047777 - cluster/prob_snapshot/cluster_48:0.013916779729707743 - cluster/prob_snapshot/cluster_49:0.019067957096526596 - 
cluster/prob_snapshot/cluster_50:0.017616176873047777 - cluster/prob_snapshot/cluster_51:0.01544006090637717 - cluster/prob_snapshot/cluster_52:0.01544006090637717 - cluster/prob_snapshot/cluster_53:0.01544006090637717 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.01544006090637717 - cluster/prob_snapshot/cluster_56:0.01544006090637717 - cluster/prob_snapshot/cluster_57:0.013916779729707743 - cluster/prob_snapshot/cluster_58:0.013916779729707743 - cluster/prob_snapshot/cluster_59:0.019067957096526596 - cluster/prob_snapshot/cluster_60:0.017616176873047777 - cluster/prob_snapshot/cluster_61:0.01544006090637717 - cluster/prob_snapshot/cluster_62:0.01544006090637717 - cluster/prob_snapshot/cluster_63:0.01544006090637717
(TaskRunner pid=2823680) Training Progress:   1%|▏         | 11/800 [15:38<19:30:03, 88.98s/it]
[36m(TaskRunner pid=2823680)[0m step:11 - global_seqlen/min:326578 - global_seqlen/max:461593 - global_seqlen/minmax_diff:135015 - global_seqlen/balanced_min:379797 - global_seqlen/balanced_max:379857 - global_seqlen/mean:379824.5 - frontier/skipped_zero_acc_count:61.0 - actor/entropy:np.float64(0.4765447439516292) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.005350890103727579 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.07223927261657082) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005776677237864694) - actor/ppo_kl:np.float64(0.00010370637552542201) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.25051255027453107) - perf/mfu/actor:np.float64(0.22904403582984226) - perf/max_memory_allocated_gb:np.float64(61.204898834228516) - perf/max_memory_reserved_gb:np.float64(67.14453125) - perf/cpu_memory_used_gb:np.float64(112.08243179321289) - actor/lr:np.float64(1e-06) - training/global_step:11 - training/epoch:0 - critic/score/mean:0.427238792181015 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4222750663757324 - critic/rewards/max:1.00319242477417 - critic/rewards/min:-0.02596677467226982 - critic/advantages/mean:-0.10859233140945435 - critic/advantages/max:2.474860191345215 - critic/advantages/min:-2.474853277206421 - critic/returns/mean:-0.10859233140945435 - critic/returns/max:2.474860191345215 - critic/returns/min:-2.474853277206421 - response_length/mean:1127.1473388671875 - response_length/max:8192.0 - response_length/min:2.0 - response_length/clip_ratio:0.005597014911472797 - response_length_non_aborted/mean:1127.1473388671875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:2.0 - response_length_non_aborted/clip_ratio:0.005597014911472797 - response/aborted_ratio:0.0 - prompt_length/mean:237.76119995117188 - prompt_length/max:395.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.925702422857285e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.23067271802574396) - timing_s/agent_loop/generate_sequences/max:np.float64(28.66403162293136) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.470662725138027) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.66403162293136) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:244 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.596650145947933 - timing_s/reward:0.0001230509951710701 - timing_s/old_log_prob:7.704601096920669 - timing_s/ref:8.980261565186083 - timing_s/adv:0.053031536750495434 - timing_s/update_actor:19.354114508256316 - timing_s/update_weights:20.957549885846674 - timing_s/step:87.98316734284163 - timing_s/stop_profile:5.737971514463425e-05 - timing_per_token_ms/adv:7.248795672786493e-05 - timing_per_token_ms/update_actor:0.026454828597202967 - timing_per_token_ms/gen:0.050644044528516764 - timing_per_token_ms/ref:0.012274975451018511 - perf/total_num_tokens:1519298 - perf/time_per_step:87.98316734284163 - perf/throughput:4317.013259137946 - frontier/active_count:62.0 - frontier/completed_count:2.0 - frontier/blacklisted_count:774.0 - frontier/mean_score:1.5585227096774195 - frontier/mean_frontier_pct:0.2591267221726272 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:5.0 - frontier/batch_hard_count:11.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:1.763 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:1.2401 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:1.5376489999999998 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.49 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:1.49 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:80.0 - frontier/cluster_7/score:1.3234489999999999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:1.5880699999999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.343 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:2.5540999999999996 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:1.343 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:1.49 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:1.49 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:1.7 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:1.49 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:1.9429999999999998 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:1.49 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.2401 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:1.343 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.09 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:1.2401 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:1.2401 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:1.343 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:1.49 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.343 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:1.343 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:1.49 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:1.6601 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:1.343 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:1.49 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:1.2401 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:2.3629999999999995 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:1.8400999999999998 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:1.7 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.49 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:1.7 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:1.343 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:2.1880699999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:1.49 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:32.0 - 
frontier/cluster_53/score:1.49 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:1.9429999999999998 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:1.49 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:1.2401 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:1.343 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:1.5880699999999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.343 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:1.49 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:11.0 - cluster/prob_snapshot/cluster_0:0.01824515208819336 - cluster/prob_snapshot/cluster_1:0.01283370000259137 - cluster/prob_snapshot/cluster_2:0.013898604228272082 - cluster/prob_snapshot/cluster_3:0.015913011833952596 - cluster/prob_snapshot/cluster_4:0.015419895979244529 - cluster/prob_snapshot/cluster_5:0.015419895979244529 - cluster/prob_snapshot/cluster_6:0.01283370000259137 - cluster/prob_snapshot/cluster_7:0.013696272425392746 - cluster/prob_snapshot/cluster_8:0.016434814904536146 - cluster/prob_snapshot/cluster_9:0.013898604228272082 - cluster/prob_snapshot/cluster_10:0.026432185450059358 - cluster/prob_snapshot/cluster_11:0.013898604228272082 - cluster/prob_snapshot/cluster_12:0.015419895979244529 - 
cluster/prob_snapshot/cluster_13:0.015419895979244529 - cluster/prob_snapshot/cluster_14:0.01759316990920517 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.015419895979244529 - cluster/prob_snapshot/cluster_17:0.01283370000259137 - cluster/prob_snapshot/cluster_18:0.02069784695200608 - cluster/prob_snapshot/cluster_19:0.020107958313873905 - cluster/prob_snapshot/cluster_20:0.015419895979244529 - cluster/prob_snapshot/cluster_21:0.01283370000259137 - cluster/prob_snapshot/cluster_22:0.02069784695200608 - cluster/prob_snapshot/cluster_23:0.013898604228272082 - cluster/prob_snapshot/cluster_24:0.02162925006484635 - cluster/prob_snapshot/cluster_25:0.01759316990920517 - cluster/prob_snapshot/cluster_26:0.01283370000259137 - cluster/prob_snapshot/cluster_27:0.01283370000259137 - cluster/prob_snapshot/cluster_28:0.013898604228272082 - cluster/prob_snapshot/cluster_29:0.01759316990920517 - cluster/prob_snapshot/cluster_30:0.015419895979244529 - cluster/prob_snapshot/cluster_31:0.013898604228272082 - cluster/prob_snapshot/cluster_32:0.015419895979244529 - cluster/prob_snapshot/cluster_33:0.013898604228272082 - cluster/prob_snapshot/cluster_34:0.015419895979244529 - cluster/prob_snapshot/cluster_35:0.017180247862512644 - cluster/prob_snapshot/cluster_36:0.013898604228272082 - cluster/prob_snapshot/cluster_37:0.015419895979244529 - cluster/prob_snapshot/cluster_38:0.015419895979244529 - cluster/prob_snapshot/cluster_39:0.01283370000259137 - cluster/prob_snapshot/cluster_40:0.024454506173795177 - cluster/prob_snapshot/cluster_41:0.019043054088193193 - cluster/prob_snapshot/cluster_42:0.015419895979244529 - cluster/prob_snapshot/cluster_43:0.01759316990920517 - cluster/prob_snapshot/cluster_44:0.015419895979244529 - cluster/prob_snapshot/cluster_45:0.02069784695200608 - cluster/prob_snapshot/cluster_46:0.015419895979244529 - cluster/prob_snapshot/cluster_47:0.01759316990920517 - cluster/prob_snapshot/cluster_48:0.013898604228272082 - 
cluster/prob_snapshot/cluster_49:0.022644168990137967 - cluster/prob_snapshot/cluster_50:0.01759316990920517 - cluster/prob_snapshot/cluster_51:0.015419895979244529 - cluster/prob_snapshot/cluster_52:0.015419895979244529 - cluster/prob_snapshot/cluster_53:0.015419895979244529 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.020107958313873905 - cluster/prob_snapshot/cluster_56:0.015419895979244529 - cluster/prob_snapshot/cluster_57:0.01283370000259137 - cluster/prob_snapshot/cluster_58:0.013898604228272082 - cluster/prob_snapshot/cluster_59:0.016434814904536146 - cluster/prob_snapshot/cluster_60:0.01759316990920517 - cluster/prob_snapshot/cluster_61:0.013898604228272082 - cluster/prob_snapshot/cluster_62:0.015419895979244529 - cluster/prob_snapshot/cluster_63:0.013898604228272082
(RewardLoopWorker pid=2826758) WARNING:2026-04-12 11:48:03,111:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:   2%|▏         | 12/800 [17:11<19:46:32, 90.35s/it]
(RewardLoopWorker pid=2826760) WARNING:2026-04-12 11:48:04,227:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m step:12 - global_seqlen/min:393842 - global_seqlen/max:423529 - global_seqlen/minmax_diff:29687 - global_seqlen/balanced_min:411281 - global_seqlen/balanced_max:411381 - global_seqlen/mean:411341.25 - frontier/skipped_zero_acc_count:57.0 - actor/entropy:np.float64(0.4044703987116615) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.007558329962193966 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.02400769491214305) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00025626748265494825) - actor/ppo_kl:np.float64(5.655782295990422e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.24010943704181248) - perf/mfu/actor:np.float64(0.24257113227668836) - perf/max_memory_allocated_gb:np.float64(61.204898834228516) - perf/max_memory_reserved_gb:np.float64(67.14453125) - perf/cpu_memory_used_gb:np.float64(112.33883666992188) - actor/lr:np.float64(1e-06) - training/global_step:12 - training/epoch:0 - critic/score/mean:0.43838027119636536 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4320799708366394 - critic/rewards/max:1.0023778676986694 - critic/rewards/min:-0.044149141758680344 - critic/advantages/mean:-0.1372564136981964 - critic/advantages/max:2.474860429763794 - critic/advantages/min:-2.474858283996582 - critic/returns/mean:-0.1372564136981964 - critic/returns/max:2.474860429763794 - critic/returns/min:-2.474858283996582 - response_length/mean:1286.9542236328125 - response_length/max:8192.0 - response_length/min:88.0 - response_length/clip_ratio:0.01232394389808178 - response_length_non_aborted/mean:1286.9542236328125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:88.0 - response_length_non_aborted/clip_ratio:0.01232394389808178 - response/aborted_ratio:0.0 - prompt_length/mean:233.23944091796875 - prompt_length/max:416.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010211672633886337 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.2789855198934674) - timing_s/agent_loop/generate_sequences/max:np.float64(28.842172197066247) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.414912030222695) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.842172197066247) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:219 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.376453111879528 - timing_s/reward:0.00019200798124074936 - timing_s/old_log_prob:7.833917198702693 - timing_s/ref:12.936375600285828 - timing_s/adv:0.05354455951601267 - timing_s/update_actor:19.794868441298604 - timing_s/update_weights:21.777580304071307 - timing_s/step:93.22302142437547 - timing_s/stop_profile:6.281957030296326e-05 - timing_per_token_ms/adv:6.201090890941511e-05 - timing_per_token_ms/update_actor:0.02292479002316074 - timing_per_token_ms/gen:0.041555223890722894 - timing_per_token_ms/ref:0.01498184719826494 - perf/total_num_tokens:1645365 - perf/time_per_step:93.22302142437547 - perf/throughput:4412.442803451601 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:831.0 - frontier/mean_score:1.5541892344262294 - frontier/mean_frontier_pct:0.28586022213238593 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:4.0 - frontier/batch_hard_count:12.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:1.763 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:1.2401 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:1.5376489999999998 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.49 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:1.49 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:1.2264142999999998 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:1.5880699999999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:1.2401 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:48.0 - frontier/cluster_10/score:2.6878699999999993 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:1.8400999999999998 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:1.343 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:1.49 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:1.7 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:1.49 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.7 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:2.2600999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:1.49 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.2401 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:1.2401 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:1.763 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:1.2401 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:1.2401 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:1.343 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:1.9429999999999998 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.343 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:1.343 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:1.343 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - 
frontier/cluster_35/score:1.6601 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:1.343 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:1.49 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:1.2401 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:2.3629999999999995 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:1.8400999999999998 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:1.7 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.49 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:1.7 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:2.1880699999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:1.49 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:32.0 - frontier/cluster_53/score:1.49 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:1.9429999999999998 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:1.343 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:1.2401 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:1.343 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:1.5880699999999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.343 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:1.49 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:12.0 - cluster/prob_snapshot/cluster_0:0.018595959040298017 - cluster/prob_snapshot/cluster_1:0.013080458766802933 - cluster/prob_snapshot/cluster_2:0.014165838338695539 - cluster/prob_snapshot/cluster_3:0.016218977777853204 - cluster/prob_snapshot/cluster_4:0.015716380584256406 - cluster/prob_snapshot/cluster_5:0.015716380584256406 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.012936103283741214 - cluster/prob_snapshot/cluster_8:0.016750813768080585 - cluster/prob_snapshot/cluster_9:0.013080458766802933 - cluster/prob_snapshot/cluster_10:0.028351401262419636 - cluster/prob_snapshot/cluster_11:0.01940920262623504 - cluster/prob_snapshot/cluster_12:0.014165838338695539 - 
cluster/prob_snapshot/cluster_13:0.015716380584256406 - cluster/prob_snapshot/cluster_14:0.017931440935057646 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.015716380584256406 - cluster/prob_snapshot/cluster_17:0.013080458766802933 - cluster/prob_snapshot/cluster_18:0.017931440935057646 - cluster/prob_snapshot/cluster_19:0.023839323327837517 - cluster/prob_snapshot/cluster_20:0.015716380584256406 - cluster/prob_snapshot/cluster_21:0.013080458766802933 - cluster/prob_snapshot/cluster_22:0.0210958128647737 - cluster/prob_snapshot/cluster_23:0.013080458766802933 - cluster/prob_snapshot/cluster_24:0.018595959040298017 - cluster/prob_snapshot/cluster_25:0.017931440935057646 - cluster/prob_snapshot/cluster_26:0.013080458766802933 - cluster/prob_snapshot/cluster_27:0.013080458766802933 - cluster/prob_snapshot/cluster_28:0.014165838338695539 - cluster/prob_snapshot/cluster_29:0.017931440935057646 - cluster/prob_snapshot/cluster_30:0.020494582198127647 - cluster/prob_snapshot/cluster_31:0.014165838338695539 - cluster/prob_snapshot/cluster_32:0.015716380584256406 - cluster/prob_snapshot/cluster_33:0.014165838338695539 - cluster/prob_snapshot/cluster_34:0.014165838338695539 - cluster/prob_snapshot/cluster_35:0.01751057946840541 - cluster/prob_snapshot/cluster_36:0.014165838338695539 - cluster/prob_snapshot/cluster_37:0.015716380584256406 - cluster/prob_snapshot/cluster_38:0.015716380584256406 - cluster/prob_snapshot/cluster_39:0.013080458766802933 - cluster/prob_snapshot/cluster_40:0.02492470289973012 - cluster/prob_snapshot/cluster_41:0.01940920262623504 - cluster/prob_snapshot/cluster_42:0.015716380584256406 - cluster/prob_snapshot/cluster_43:0.017931440935057646 - cluster/prob_snapshot/cluster_44:0.015716380584256406 - cluster/prob_snapshot/cluster_45:0.017931440935057646 - cluster/prob_snapshot/cluster_46:0.015716380584256406 - cluster/prob_snapshot/cluster_47:0.017931440935057646 - cluster/prob_snapshot/cluster_48:0.013080458766802933 - 
cluster/prob_snapshot/cluster_49:0.02307955762751269 - cluster/prob_snapshot/cluster_50:0.017931440935057646 - cluster/prob_snapshot/cluster_51:0.015716380584256406 - cluster/prob_snapshot/cluster_52:0.015716380584256406 - cluster/prob_snapshot/cluster_53:0.015716380584256406 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.020494582198127647 - cluster/prob_snapshot/cluster_56:0.014165838338695539 - cluster/prob_snapshot/cluster_57:0.013080458766802933 - cluster/prob_snapshot/cluster_58:0.014165838338695539 - cluster/prob_snapshot/cluster_59:0.016750813768080585 - cluster/prob_snapshot/cluster_60:0.015716380584256406 - cluster/prob_snapshot/cluster_61:0.014165838338695539 - cluster/prob_snapshot/cluster_62:0.015716380584256406 - cluster/prob_snapshot/cluster_63:0.014165838338695539
(RewardLoopWorker pid=2826759) WARNING:2026-04-12 11:49:36,872:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:   2%|▏         | 13/800 [18:53<20:27:59, 93.62s/it]
[36m(TaskRunner pid=2823680)[0m step:13 - global_seqlen/min:313451 - global_seqlen/max:439378 - global_seqlen/minmax_diff:125927 - global_seqlen/balanced_min:384272 - global_seqlen/balanced_max:384287 - global_seqlen/mean:384280.0 - frontier/skipped_zero_acc_count:48.0 - actor/entropy:np.float64(0.40044774357229473) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008320018649101257 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.009860592712357175) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00029360194116634375) - actor/ppo_kl:np.float64(-2.3990997510736635e-05) - actor/pg_clipfrac_lower:np.float64(2.4347487851628103e-06) - actor/grad_norm:np.float64(0.24625305831432343) - perf/mfu/actor:np.float64(0.2657315168179532) - perf/max_memory_allocated_gb:np.float64(61.93788003921509) - perf/max_memory_reserved_gb:np.float64(67.970703125) - perf/cpu_memory_used_gb:np.float64(112.79935073852539) - actor/lr:np.float64(1e-06) - training/global_step:13 - training/epoch:0 - critic/score/mean:0.48124998807907104 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.475341796875 - critic/rewards/max:1.001556634902954 - critic/rewards/min:-0.02467767708003521 - critic/advantages/mean:-0.17281818389892578 - critic/advantages/max:2.474853992462158 - critic/advantages/min:-2.4748477935791016 - critic/returns/mean:-0.17281818389892578 - critic/returns/max:2.474853992462158 - critic/returns/min:-2.4748477935791016 - response_length/mean:1126.9390869140625 - response_length/max:8192.0 - response_length/min:87.0 - response_length/clip_ratio:0.00937500037252903 - response_length_non_aborted/mean:1126.9390869140625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:87.0 - response_length_non_aborted/clip_ratio:0.00937500037252903 - response/aborted_ratio:0.0 - prompt_length/mean:228.875 - prompt_length/max:433.0 - prompt_length/min:170.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.519645780324936e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.5559764411300421) - timing_s/agent_loop/generate_sequences/max:np.float64(27.77063743211329) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.72056988838267) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.77063743211329) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:299 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.74539395328611 - timing_s/reward:0.00012265890836715698 - timing_s/old_log_prob:9.153762052766979 - timing_s/ref:18.593323909677565 - timing_s/adv:0.06357978377491236 - timing_s/update_actor:16.98763000871986 - timing_s/update_weights:26.026435473002493 - timing_s/step:100.94673229847103 - timing_s/stop_profile:5.4737553000450134e-05 - timing_per_token_ms/adv:7.327215058171043e-05 - timing_per_token_ms/update_actor:0.01957729501616287 - timing_per_token_ms/gen:0.04124196205330272 - timing_per_token_ms/ref:0.02142776757699487 - perf/total_num_tokens:1537120 - perf/time_per_step:100.94673229847103 - perf/throughput:3806.760171927035 - frontier/active_count:58.0 - frontier/completed_count:6.0 - frontier/blacklisted_count:879.0 - frontier/mean_score:1.5866833896551724 - frontier/mean_frontier_pct:0.2950098728538803 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:3.0 - frontier/batch_hard_count:11.0 - frontier/force_completed_count:5.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:1.763 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:1.2401 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:1.3763542999999998 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.49 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:1.49 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:1.2264142999999998 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:1.5880699999999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:3.3815089999999994 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:1.8400999999999998 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:1.343 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:1.343 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:1.49 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:1.49 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.7680699999999998 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.7 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:2.2600999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:1.49 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.2401 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:1.2401 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:1.763 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:1.2401 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:1.343 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:1.9429999999999998 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.343 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:1.343 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:1.343 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - 
frontier/cluster_35/score:1.6601 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:1.343 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:1.49 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:1.2401 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:48.0 - frontier/cluster_40/score:3.1540999999999997 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:1.8400999999999998 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:1.7 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.49 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:1.7 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:2.4316489999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:1.49 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.2600999999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:1.343 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:1.2401 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:1.2401 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:1.5880699999999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.343 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:1.49 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:13.0 - cluster/prob_snapshot/cluster_0:0.019157288670390564 - cluster/prob_snapshot/cluster_1:0.013475299875298549 - cluster/prob_snapshot/cluster_2:0.013475299875298549 - cluster/prob_snapshot/cluster_3:0.01495588011221403 - cluster/prob_snapshot/cluster_4:0.01619078849624614 - cluster/prob_snapshot/cluster_5:0.01619078849624614 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.013326586939645476 - cluster/prob_snapshot/cluster_8:0.01725644663572725 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.03674449464238441 - cluster/prob_snapshot/cluster_11:0.01999508047781377 - cluster/prob_snapshot/cluster_12:0.014593442248629908 - 
cluster/prob_snapshot/cluster_13:0.014593442248629908 - cluster/prob_snapshot/cluster_14:0.01619078849624614 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.01619078849624614 - cluster/prob_snapshot/cluster_17:0.019212380816481817 - cluster/prob_snapshot/cluster_18:0.018472711707126468 - cluster/prob_snapshot/cluster_19:0.024558926899574424 - cluster/prob_snapshot/cluster_20:0.01619078849624614 - cluster/prob_snapshot/cluster_21:0.013475299875298549 - cluster/prob_snapshot/cluster_22:0.018472711707126468 - cluster/prob_snapshot/cluster_23:0.013475299875298549 - cluster/prob_snapshot/cluster_24:0.019157288670390564 - cluster/prob_snapshot/cluster_25:0.01619078849624614 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.013475299875298549 - cluster/prob_snapshot/cluster_28:0.014593442248629908 - cluster/prob_snapshot/cluster_29:0.018472711707126468 - cluster/prob_snapshot/cluster_30:0.02111322285114513 - cluster/prob_snapshot/cluster_31:0.014593442248629908 - cluster/prob_snapshot/cluster_32:0.01619078849624614 - cluster/prob_snapshot/cluster_33:0.014593442248629908 - cluster/prob_snapshot/cluster_34:0.014593442248629908 - cluster/prob_snapshot/cluster_35:0.018039146297059203 - cluster/prob_snapshot/cluster_36:0.014593442248629908 - cluster/prob_snapshot/cluster_37:0.01619078849624614 - cluster/prob_snapshot/cluster_38:0.01619078849624614 - cluster/prob_snapshot/cluster_39:0.013475299875298549 - cluster/prob_snapshot/cluster_40:0.03427339999732211 - cluster/prob_snapshot/cluster_41:0.01999508047781377 - cluster/prob_snapshot/cluster_42:0.01619078849624614 - cluster/prob_snapshot/cluster_43:0.018472711707126468 - cluster/prob_snapshot/cluster_44:0.01619078849624614 - cluster/prob_snapshot/cluster_45:0.018472711707126468 - cluster/prob_snapshot/cluster_46:0.01619078849624614 - cluster/prob_snapshot/cluster_47:0.018472711707126468 - cluster/prob_snapshot/cluster_48:0.013475299875298549 - 
cluster/prob_snapshot/cluster_49:0.026423029970542564 - cluster/prob_snapshot/cluster_50:0.018472711707126468 - cluster/prob_snapshot/cluster_51:0.01619078849624614 - cluster/prob_snapshot/cluster_52:0.01619078849624614 - cluster/prob_snapshot/cluster_53:0.014593442248629908 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.024558926899574424 - cluster/prob_snapshot/cluster_56:0.014593442248629908 - cluster/prob_snapshot/cluster_57:0.013475299875298549 - cluster/prob_snapshot/cluster_58:0.013475299875298549 - cluster/prob_snapshot/cluster_59:0.01725644663572725 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.014593442248629908 - cluster/prob_snapshot/cluster_62:0.01619078849624614 - cluster/prob_snapshot/cluster_63:0.014593442248629908
[36m(TaskRunner pid=2823680)[0m Training Progress:   2%|▏         | 14/800 [20:41<21:25:37, 98.14s/it]
[36m(TaskRunner pid=2823680)[0m step:14 - global_seqlen/min:330533 - global_seqlen/max:462399 - global_seqlen/minmax_diff:131866 - global_seqlen/balanced_min:405890 - global_seqlen/balanced_max:405999 - global_seqlen/mean:405947.25 - frontier/skipped_zero_acc_count:43.0 - actor/entropy:np.float64(0.40134415628258574) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.00829640869051218 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.14819657112821005) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00019080174798694126) - actor/ppo_kl:np.float64(-9.706903691880523e-06) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.23593098737976767) - perf/mfu/actor:np.float64(0.25263135451771285) - perf/max_memory_allocated_gb:np.float64(66.99166011810303) - perf/max_memory_reserved_gb:np.float64(72.580078125) - perf/cpu_memory_used_gb:np.float64(113.33294677734375) - actor/lr:np.float64(1e-06) - training/global_step:14 - training/epoch:0 - critic/score/mean:0.48235294222831726 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.47561272978782654 - critic/rewards/max:1.003830909729004 - critic/rewards/min:-0.04906509071588516 - critic/advantages/mean:-0.16567708551883698 - critic/advantages/max:2.4748451709747314 - critic/advantages/min:-2.4748523235321045 - critic/returns/mean:-0.16567708551883698 - critic/returns/max:2.4748451709747314 - critic/returns/min:-2.4748523235321045 - response_length/mean:1223.547119140625 - response_length/max:8192.0 - response_length/min:140.0 - response_length/clip_ratio:0.007352941203862429 - response_length_non_aborted/mean:1223.547119140625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:140.0 - response_length_non_aborted/clip_ratio:0.007352941203862429 - response/aborted_ratio:0.0 - prompt_length/mean:239.15293884277344 - prompt_length/max:478.0 - prompt_length/min:170.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.967798203229904e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1238551996648312) - timing_s/agent_loop/generate_sequences/max:np.float64(28.194714594632387) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.999124465513887) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.194714594632387) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:257 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.854625940322876 - timing_s/reward:0.000284331850707531 - timing_s/old_log_prob:9.235529363155365 - timing_s/ref:20.639950009062886 - timing_s/adv:0.06672813557088375 - timing_s/update_actor:18.85497944895178 - timing_s/update_weights:28.337586329318583 - timing_s/step:108.37368817813694 - timing_s/stop_profile:6.12279400229454e-05 - timing_per_token_ms/adv:6.708799557917042e-05 - timing_per_token_ms/update_actor:0.01895666298922599 - timing_per_token_ms/gen:0.037084352077040805 - timing_per_token_ms/ref:0.020751259766450123 - perf/total_num_tokens:1623789 - perf/time_per_step:108.37368817813694 - perf/throughput:3745.8100469251617 - frontier/active_count:58.0 - frontier/completed_count:6.0 - frontier/blacklisted_count:922.0 - frontier/mean_score:1.6037830622413793 - frontier/mean_frontier_pct:0.3101650404762983 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:6.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:5.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:1.763 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:1.2401 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:112.0 - frontier/cluster_3/score:1.2634480099999998 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.49 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.343 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:1.2264142999999998 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:1.5880699999999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:3.8670562999999993 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:1.8400999999999998 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:1.343 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:1.343 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:1.9429999999999998 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:1.49 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - 
frontier/cluster_17/score:1.7680699999999998 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.7 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:2.4820699999999993 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:1.343 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.2401 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:1.2401 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:1.763 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:1.2401 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:1.8400999999999998 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:1.6601 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.343 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:1.343 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:1.343 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:1.6601 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.2401 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.343 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:1.2401 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:48.0 - frontier/cluster_40/score:3.1078699999999997 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:1.8400999999999998 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.343 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.09 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:1.49 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.49 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:1.49 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:2.4316489999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:1.49 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - 
frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.2600999999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:1.8400999999999998 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:1.2401 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:1.2401 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:1.5880699999999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.343 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:1.49 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:14.0 - cluster/prob_snapshot/cluster_0:0.01895303201522592 - cluster/prob_snapshot/cluster_1:0.013331625072082624 - cluster/prob_snapshot/cluster_2:0.013331625072082624 - cluster/prob_snapshot/cluster_3:0.013582626536076845 - cluster/prob_snapshot/cluster_4:0.016018160920412153 - cluster/prob_snapshot/cluster_5:0.014437845715512428 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.013184497726506457 - cluster/prob_snapshot/cluster_8:0.01707245692139525 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.04157257053804939 - cluster/prob_snapshot/cluster_11:0.01978189121453047 - 
cluster/prob_snapshot/cluster_12:0.014437845715512428 - cluster/prob_snapshot/cluster_13:0.014437845715512428 - cluster/prob_snapshot/cluster_14:0.020888111857960273 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.016018160920412153 - cluster/prob_snapshot/cluster_17:0.019007536764129605 - cluster/prob_snapshot/cluster_18:0.018275754070268897 - cluster/prob_snapshot/cluster_19:0.026683353473642535 - cluster/prob_snapshot/cluster_20:0.014437845715512428 - cluster/prob_snapshot/cluster_21:0.013331625072082624 - cluster/prob_snapshot/cluster_22:0.018275754070268897 - cluster/prob_snapshot/cluster_23:0.013331625072082624 - cluster/prob_snapshot/cluster_24:0.01895303201522592 - cluster/prob_snapshot/cluster_25:0.016018160920412153 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.013331625072082624 - cluster/prob_snapshot/cluster_28:0.01978189121453047 - cluster/prob_snapshot/cluster_29:0.018275754070268897 - cluster/prob_snapshot/cluster_30:0.017846811371796115 - cluster/prob_snapshot/cluster_31:0.014437845715512428 - cluster/prob_snapshot/cluster_32:0.016018160920412153 - cluster/prob_snapshot/cluster_33:0.014437845715512428 - cluster/prob_snapshot/cluster_34:0.014437845715512428 - cluster/prob_snapshot/cluster_35:0.017846811371796115 - cluster/prob_snapshot/cluster_36:0.013331625072082624 - cluster/prob_snapshot/cluster_37:0.014437845715512428 - cluster/prob_snapshot/cluster_38:0.016018160920412153 - cluster/prob_snapshot/cluster_39:0.013331625072082624 - cluster/prob_snapshot/cluster_40:0.033410981060215644 - cluster/prob_snapshot/cluster_41:0.01978189121453047 - cluster/prob_snapshot/cluster_42:0.014437845715512428 - cluster/prob_snapshot/cluster_43:0.022468427062859998 - cluster/prob_snapshot/cluster_44:0.016018160920412153 - cluster/prob_snapshot/cluster_45:0.016018160920412153 - cluster/prob_snapshot/cluster_46:0.016018160920412153 - cluster/prob_snapshot/cluster_47:0.016018160920412153 - 
cluster/prob_snapshot/cluster_48:0.013331625072082624 - cluster/prob_snapshot/cluster_49:0.026141305358361934 - cluster/prob_snapshot/cluster_50:0.018275754070268897 - cluster/prob_snapshot/cluster_51:0.016018160920412153 - cluster/prob_snapshot/cluster_52:0.016018160920412153 - cluster/prob_snapshot/cluster_53:0.014437845715512428 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.024297077514243957 - cluster/prob_snapshot/cluster_56:0.01978189121453047 - cluster/prob_snapshot/cluster_57:0.013331625072082624 - cluster/prob_snapshot/cluster_58:0.013331625072082624 - cluster/prob_snapshot/cluster_59:0.01707245692139525 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.014437845715512428 - cluster/prob_snapshot/cluster_62:0.016018160920412153 - cluster/prob_snapshot/cluster_63:0.014437845715512428
[36m(RewardLoopWorker pid=2826756)[0m WARNING:2026-04-12 11:53:06,818:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:   2%|▏         | 15/800 [22:31<22:11:26, 101.77s/it]
[36m(TaskRunner pid=2823680)[0m step:15 - global_seqlen/min:383408 - global_seqlen/max:411163 - global_seqlen/minmax_diff:27755 - global_seqlen/balanced_min:395590 - global_seqlen/balanced_max:395673 - global_seqlen/mean:395619.0 - frontier/skipped_zero_acc_count:43.0 - actor/entropy:np.float64(0.4400095431957134) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0084740174934268 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.026787254959344864) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00024758645464993776) - actor/ppo_kl:np.float64(0.00010114471238897601) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.24318433620712973) - perf/mfu/actor:np.float64(0.25222909639929575) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(113.53556823730469) - actor/lr:np.float64(1e-06) - training/global_step:15 - training/epoch:0 - critic/score/mean:0.43970587849617004 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4326062798500061 - critic/rewards/max:1.0026334524154663 - critic/rewards/min:-0.03810589760541916 - critic/advantages/mean:-0.12325119972229004 - critic/advantages/max:2.4748423099517822 - critic/advantages/min:-2.4748573303222656 - critic/returns/mean:-0.12325119972229004 - critic/returns/max:2.4748423099517822 - critic/returns/min:-2.4748573303222656 - response_length/mean:1267.12646484375 - response_length/max:8192.0 - response_length/min:26.0 - response_length/clip_ratio:0.007352941203862429 - response_length_non_aborted/mean:1267.12646484375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:26.0 - response_length_non_aborted/clip_ratio:0.007352941203862429 - response/aborted_ratio:0.0 - prompt_length/mean:240.0941162109375 - prompt_length/max:502.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.59573483467102e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.33531659096479416) - timing_s/agent_loop/generate_sequences/max:np.float64(27.972781036049128) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.766380445654249) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.972781036049128) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:217 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.50844039209187 - timing_s/reward:0.0009000357240438461 - timing_s/old_log_prob:9.011186257004738 - timing_s/ref:21.863934634253383 - timing_s/adv:0.08972027339041233 - timing_s/update_actor:18.436080266721547 - timing_s/update_weights:29.580754675902426 - timing_s/step:109.94272406585515 - timing_s/stop_profile:6.58910721540451e-05 - timing_per_token_ms/adv:8.753966044863678e-05 - timing_per_token_ms/update_actor:0.017987999206487932 - timing_per_token_ms/gen:0.0354071630252933 - timing_per_token_ms/ref:0.021332541037021185 - perf/total_num_tokens:1582476 - perf/time_per_step:109.94272406585515 - perf/throughput:3598.4100208671034 - frontier/active_count:54.0 - frontier/completed_count:10.0 - frontier/blacklisted_count:965.0 - frontier/mean_score:1.6533993816666666 - frontier/mean_frontier_pct:0.31937750205730236 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:9.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:1.763 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:1.2401 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:112.0 - frontier/cluster_3/score:1.2634480099999998 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.49 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.343 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:1.5880699999999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:3.8670562999999993 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:1.5880699999999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:1.343 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:1.343 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:2.2600999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:1.9429999999999998 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - 
frontier/cluster_17/score:1.7680699999999998 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.7 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:2.4820699999999993 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:1.343 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.2401 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:1.2401 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:1.763 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:1.8400999999999998 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:2.09 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:1.6601 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:1.2401 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:1.343 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:1.343 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:1.6601 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.2401 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.343 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.0755089999999994 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.1880699999999997 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.343 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.09 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:1.343 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:1.49 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.49 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:1.49 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:2.6021542999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:1.343 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - 
frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.2600999999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:1.8400999999999998 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:1.2401 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:1.2401 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:2.011649 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.343 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:1.49 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:15.0 - cluster/prob_snapshot/cluster_0:0.019746074971455488 - cluster/prob_snapshot/cluster_1:0.013889454096484375 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.014150958098693274 - cluster/prob_snapshot/cluster_4:0.016688401422273784 - cluster/prob_snapshot/cluster_5:0.015041961818868249 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.017786811843402903 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.043312072387203214 - cluster/prob_snapshot/cluster_11:0.017786811843402903 - cluster/prob_snapshot/cluster_12:0.015041961818868249 - 
cluster/prob_snapshot/cluster_13:0.015041961818868249 - cluster/prob_snapshot/cluster_14:0.02531372889562481 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.02176212346542145 - cluster/prob_snapshot/cluster_17:0.019802860337368864 - cluster/prob_snapshot/cluster_18:0.019040457998567403 - cluster/prob_snapshot/cluster_19:0.027799852696767168 - cluster/prob_snapshot/cluster_20:0.015041961818868249 - cluster/prob_snapshot/cluster_21:0.013889454096484375 - cluster/prob_snapshot/cluster_22:0.019040457998567403 - cluster/prob_snapshot/cluster_23:0.013889454096484375 - cluster/prob_snapshot/cluster_24:0.019746074971455488 - cluster/prob_snapshot/cluster_25:0.016688401422273784 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.020609615743037574 - cluster/prob_snapshot/cluster_29:0.023408563068826983 - cluster/prob_snapshot/cluster_30:0.018593567249071616 - cluster/prob_snapshot/cluster_31:0.013889454096484375 - cluster/prob_snapshot/cluster_32:0.016688401422273784 - cluster/prob_snapshot/cluster_33:0.015041961818868249 - cluster/prob_snapshot/cluster_34:0.015041961818868249 - cluster/prob_snapshot/cluster_35:0.018593567249071616 - cluster/prob_snapshot/cluster_36:0.013889454096484375 - cluster/prob_snapshot/cluster_37:0.015041961818868249 - cluster/prob_snapshot/cluster_38:0.016688401422273784 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.03444652937571531 - cluster/prob_snapshot/cluster_41:0.0245069734899561 - cluster/prob_snapshot/cluster_42:0.015041961818868249 - cluster/prob_snapshot/cluster_43:0.023408563068826983 - cluster/prob_snapshot/cluster_44:0.015041961818868249 - cluster/prob_snapshot/cluster_45:0.016688401422273784 - cluster/prob_snapshot/cluster_46:0.016688401422273784 - cluster/prob_snapshot/cluster_47:0.016688401422273784 - cluster/prob_snapshot/cluster_48:0.013889454096484375 - cluster/prob_snapshot/cluster_49:0.02914482920878915 - 
cluster/prob_snapshot/cluster_50:0.016688401422273784 - cluster/prob_snapshot/cluster_51:0.015041961818868249 - cluster/prob_snapshot/cluster_52:0.016688401422273784 - cluster/prob_snapshot/cluster_53:0.015041961818868249 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.02531372889562481 - cluster/prob_snapshot/cluster_56:0.020609615743037574 - cluster/prob_snapshot/cluster_57:0.013889454096484375 - cluster/prob_snapshot/cluster_58:0.013889454096484375 - cluster/prob_snapshot/cluster_59:0.02253101076021183 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.015041961818868249 - cluster/prob_snapshot/cluster_62:0.016688401422273784 - cluster/prob_snapshot/cluster_63:0.015041961818868249
[36m(TaskRunner pid=2823680)[0m Training Progress:   2%|▏         | 16/800 [24:17<22:26:24, 103.04s/it]
[36m(TaskRunner pid=2823680)[0m step:16 - global_seqlen/min:381598 - global_seqlen/max:412610 - global_seqlen/minmax_diff:31012 - global_seqlen/balanced_min:395550 - global_seqlen/balanced_max:395656 - global_seqlen/mean:395598.5 - frontier/skipped_zero_acc_count:48.0 - actor/entropy:np.float64(0.3559919177554548) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.007000236306339502 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.005130635316163534) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0001698005487469345) - actor/ppo_kl:np.float64(-1.4684676476406366e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.21021923869848252) - perf/mfu/actor:np.float64(0.256189518777864) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(112.9197769165039) - actor/lr:np.float64(1e-06) - training/global_step:16 - training/epoch:0 - critic/score/mean:0.43281251192092896 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.42660975456237793 - critic/rewards/max:1.0029728412628174 - critic/rewards/min:-0.04153997823596001 - critic/advantages/mean:-0.10372818261384964 - critic/advantages/max:2.4748623371124268 - critic/advantages/min:-2.474856376647949 - critic/returns/mean:-0.10372818261384964 - critic/returns/max:2.4748623371124268 - critic/returns/min:-2.474856376647949 - response_length/mean:1223.5218505859375 - response_length/max:8192.0 - response_length/min:231.0 - response_length/clip_ratio:0.004687500186264515 - response_length_non_aborted/mean:1223.5218505859375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:231.0 - response_length_non_aborted/clip_ratio:0.004687500186264515 - response/aborted_ratio:0.0 - prompt_length/mean:233.64999389648438 - prompt_length/max:344.0 - prompt_length/min:187.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00011859368532896042 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.755908790975809) - timing_s/agent_loop/generate_sequences/max:np.float64(28.028035728260875) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.077149902012934) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.028035728260875) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:242 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.86740454658866 - timing_s/reward:0.0003839293494820595 - timing_s/old_log_prob:9.047442900016904 - timing_s/ref:20.407331318594515 - timing_s/adv:0.06882118713110685 - timing_s/update_actor:18.783251305110753 - timing_s/update_weights:27.179073533043265 - timing_s/step:105.7795991031453 - timing_s/stop_profile:0.00010492559522390366 - timing_per_token_ms/adv:7.379575926302754e-05 - timing_per_token_ms/update_actor:0.020140952942998266 - timing_per_token_ms/gen:0.038142202896082084 - timing_per_token_ms/ref:0.021882425630335427 - perf/total_num_tokens:1582394 - perf/time_per_step:105.7795991031453 - perf/throughput:3739.8373916529345 - frontier/active_count:53.0 - frontier/completed_count:11.0 - frontier/blacklisted_count:1013.0 - frontier/mean_score:1.683014615283019 - frontier/mean_frontier_pct:0.3384013331056346 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:10.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:1.763 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:112.0 - frontier/cluster_3/score:1.2634480099999998 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.49 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.343 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:1.5880699999999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:3.8670562999999993 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.011649 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:1.343 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:1.2401 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:2.4820699999999993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:1.9429999999999998 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.7680699999999998 
- frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.09 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:2.6374489999999993 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:1.2401 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.2401 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:1.2401 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:1.763 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.1880699999999997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:2.09 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:1.46207 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:1.2401 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:1.2401 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:1.343 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - 
frontier/cluster_35/score:1.46207 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.2401 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.343 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.0755089999999994 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.1880699999999997 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.343 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.09 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:1.8400999999999998 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:1.49 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.49 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:1.49 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:2.6021542999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:1.343 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - 
frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:1.2401 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.4820699999999993 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:1.8400999999999998 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:1.2401 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:1.2401 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:2.011649 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.343 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:1.343 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:16.0 - cluster/prob_snapshot/cluster_0:0.019764623932159057 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.014164251148885271 - cluster/prob_snapshot/cluster_4:0.01670407808219909 - cluster/prob_snapshot/cluster_5:0.01505609185529757 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.017803520322146247 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.04335275864661738 - cluster/prob_snapshot/cluster_11:0.022552175818776988 - cluster/prob_snapshot/cluster_12:0.01505609185529757 - 
cluster/prob_snapshot/cluster_13:0.013902501496466504 - cluster/prob_snapshot/cluster_14:0.027825967171465696 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.02178256625081398 - cluster/prob_snapshot/cluster_17:0.01982146264080117 - cluster/prob_snapshot/cluster_18:0.0234305524777155 - cluster/prob_snapshot/cluster_19:0.029567888613300602 - cluster/prob_snapshot/cluster_20:0.013902501496466504 - cluster/prob_snapshot/cluster_21:0.013902501496466504 - cluster/prob_snapshot/cluster_22:0.019058344120629834 - cluster/prob_snapshot/cluster_23:0.013902501496466504 - cluster/prob_snapshot/cluster_24:0.019764623932159057 - cluster/prob_snapshot/cluster_25:0.01505609185529757 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.024529994717662658 - cluster/prob_snapshot/cluster_29:0.0234305524777155 - cluster/prob_snapshot/cluster_30:0.0163909606990878 - cluster/prob_snapshot/cluster_31:0.013902501496466504 - cluster/prob_snapshot/cluster_32:0.01670407808219909 - cluster/prob_snapshot/cluster_33:0.013902501496466504 - cluster/prob_snapshot/cluster_34:0.01505609185529757 - cluster/prob_snapshot/cluster_35:0.0163909606990878 - cluster/prob_snapshot/cluster_36:0.013902501496466504 - cluster/prob_snapshot/cluster_37:0.01505609185529757 - cluster/prob_snapshot/cluster_38:0.01670407808219909 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.03447888756946713 - cluster/prob_snapshot/cluster_41:0.024529994717662658 - cluster/prob_snapshot/cluster_42:0.01505609185529757 - cluster/prob_snapshot/cluster_43:0.0234305524777155 - cluster/prob_snapshot/cluster_44:0.020628975891982915 - cluster/prob_snapshot/cluster_45:0.01670407808219909 - cluster/prob_snapshot/cluster_46:0.01670407808219909 - cluster/prob_snapshot/cluster_47:0.01670407808219909 - cluster/prob_snapshot/cluster_48:0.013902501496466504 - cluster/prob_snapshot/cluster_49:0.02917220712022155 - 
cluster/prob_snapshot/cluster_50:0.01670407808219909 - cluster/prob_snapshot/cluster_51:0.01505609185529757 - cluster/prob_snapshot/cluster_52:0.01670407808219909 - cluster/prob_snapshot/cluster_53:0.013902501496466504 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.027825967171465696 - cluster/prob_snapshot/cluster_56:0.020628975891982915 - cluster/prob_snapshot/cluster_57:0.013902501496466504 - cluster/prob_snapshot/cluster_58:0.013902501496466504 - cluster/prob_snapshot/cluster_59:0.022552175818776988 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.01505609185529757 - cluster/prob_snapshot/cluster_62:0.01505609185529757 - cluster/prob_snapshot/cluster_63:0.01505609185529757
[36m(TaskRunner pid=2823680)[0m Training Progress:   2%|▏         | 17/800 [26:00<22:23:37, 102.96s/it]
[36m(TaskRunner pid=2823680)[0m step:17 - global_seqlen/min:294336 - global_seqlen/max:444390 - global_seqlen/minmax_diff:150054 - global_seqlen/balanced_min:384312 - global_seqlen/balanced_max:384518 - global_seqlen/mean:384407.25 - frontier/skipped_zero_acc_count:51.0 - actor/entropy:np.float64(0.385968486372477) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009460646659135818 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0297732689359691) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00017513606167668538) - actor/ppo_kl:np.float64(-1.6020754625783295e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2415893480181694) - perf/mfu/actor:np.float64(0.26029548878081754) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(112.95125961303711) - actor/lr:np.float64(1e-06) - training/global_step:17 - training/epoch:0 - critic/score/mean:0.4529220759868622 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4463583827018738 - critic/rewards/max:1.0007437467575073 - critic/rewards/min:-0.028294678777456284 - critic/advantages/mean:-0.18808765709400177 - critic/advantages/max:2.4748599529266357 - critic/advantages/min:-2.4748456478118896 - critic/returns/mean:-0.18808765709400177 - critic/returns/max:2.4748599529266357 - critic/returns/min:-2.4748456478118896 - response_length/mean:1183.6607666015625 - response_length/max:8192.0 - response_length/min:142.0 - response_length/clip_ratio:0.012987012974917889 - response_length_non_aborted/mean:1183.6607666015625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:142.0 - response_length_non_aborted/clip_ratio:0.012987012974917889 - response/aborted_ratio:0.0 - prompt_length/mean:241.9220733642578 - prompt_length/max:641.0 - prompt_length/min:179.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.609145879745483e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1627117386087775) - timing_s/agent_loop/generate_sequences/max:np.float64(28.373954482376575) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.619012229917644) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.373954482376575) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:247 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.52467802911997 - timing_s/reward:0.00016037095338106155 - timing_s/old_log_prob:8.771035615354776 - timing_s/ref:19.06773021724075 - timing_s/adv:0.06833687331527472 - timing_s/update_actor:17.34576139599085 - timing_s/update_weights:26.40826226118952 - timing_s/step:102.57207028008997 - timing_s/stop_profile:6.320979446172714e-05 - timing_per_token_ms/adv:7.781833735721517e-05 - timing_per_token_ms/update_actor:0.01975241544639507 - timing_per_token_ms/gen:0.041864233686656066 - timing_per_token_ms/ref:0.02171330045839165 - perf/total_num_tokens:1537629 - perf/time_per_step:102.57207028008997 - perf/throughput:3747.679548149048 - frontier/active_count:49.0 - frontier/completed_count:15.0 - frontier/blacklisted_count:1064.0 - frontier/mean_score:1.690462224897959 - frontier/mean_frontier_pct:0.3582751588241199 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:4.0 - frontier/batch_hard_count:12.0 - frontier/force_completed_count:14.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:1.763 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:112.0 - frontier/cluster_3/score:1.2634480099999998 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.49 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.343 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:1.5880699999999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:3.6069394099999994 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.011649 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.2401 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:1.2401 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.037448999999999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:1.9429999999999998 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.7680699999999998 
- frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:1.763 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:2.6374489999999993 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:1.2401 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:1.763 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.4316489999999997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.763 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.3234489999999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:1.2401 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.343 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:1.2401 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:1.343 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - 
frontier/cluster_35/score:1.46207 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.2401 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.343 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.0755089999999994 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.1880699999999997 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.343 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.09 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:2.1880699999999997 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:1.49 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.49 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:1.49 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:2.6021542999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:1.343 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - 
frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:2.6374489999999993 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:1.8400999999999998 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:1.2401 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:1.7081543 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.343 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:1.343 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.2401 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:17.0 - cluster/prob_snapshot/cluster_0:0.021283878046376648 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.015253019491081828 - cluster/prob_snapshot/cluster_4:0.017988076170789114 - cluster/prob_snapshot/cluster_5:0.016213413622395825 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.019172029613788635 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.04354489989966519 - cluster/prob_snapshot/cluster_11:0.02428570163818238 - cluster/prob_snapshot/cluster_12:0.014971149838520522 - 
cluster/prob_snapshot/cluster_13:0.014971149838520522 - cluster/prob_snapshot/cluster_14:0.02459717302422691 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.023456934228082715 - cluster/prob_snapshot/cluster_17:0.0213450857954947 - cluster/prob_snapshot/cluster_18:0.021283878046376648 - cluster/prob_snapshot/cluster_19:0.0318406936299138 - cluster/prob_snapshot/cluster_20:0.014971149838520522 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.020523308382779524 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.021283878046376648 - cluster/prob_snapshot/cluster_25:0.016213413622395825 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0293561660621632 - cluster/prob_snapshot/cluster_29:0.021283878046376648 - cluster/prob_snapshot/cluster_30:0.015977383503459515 - cluster/prob_snapshot/cluster_31:0.014971149838520522 - cluster/prob_snapshot/cluster_32:0.016213413622395825 - cluster/prob_snapshot/cluster_33:0.014971149838520522 - cluster/prob_snapshot/cluster_34:0.016213413622395825 - cluster/prob_snapshot/cluster_35:0.01765089028659439 - cluster/prob_snapshot/cluster_36:0.014971149838520522 - cluster/prob_snapshot/cluster_37:0.016213413622395825 - cluster/prob_snapshot/cluster_38:0.017988076170789114 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0371291880241258 - cluster/prob_snapshot/cluster_41:0.026415550219475524 - cluster/prob_snapshot/cluster_42:0.016213413622395825 - cluster/prob_snapshot/cluster_43:0.025231596776476004 - cluster/prob_snapshot/cluster_44:0.026415550219475524 - cluster/prob_snapshot/cluster_45:0.017988076170789114 - cluster/prob_snapshot/cluster_46:0.017988076170789114 - cluster/prob_snapshot/cluster_47:0.017988076170789114 - cluster/prob_snapshot/cluster_48:0.014971149838520522 - cluster/prob_snapshot/cluster_49:0.031414597152044574 - 
cluster/prob_snapshot/cluster_50:0.017988076170789114 - cluster/prob_snapshot/cluster_51:0.016213413622395825 - cluster/prob_snapshot/cluster_52:0.017988076170789114 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0318406936299138 - cluster/prob_snapshot/cluster_56:0.022214670444207413 - cluster/prob_snapshot/cluster_57:0.014971149838520522 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.02062175144957111 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.016213413622395825 - cluster/prob_snapshot/cluster_62:0.016213413622395825 - cluster/prob_snapshot/cluster_63:0.014971149838520522
[36m(TaskRunner pid=2823680)[0m Training Progress:   2%|▏         | 18/800 [27:36<21:52:50, 100.73s/it]
[36m(TaskRunner pid=2823680)[0m step:18 - global_seqlen/min:369326 - global_seqlen/max:490732 - global_seqlen/minmax_diff:121406 - global_seqlen/balanced_min:408190 - global_seqlen/balanced_max:408305 - global_seqlen/mean:408247.25 - frontier/skipped_zero_acc_count:54.0 - actor/entropy:np.float64(0.4196254411661947) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008516984060406685 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.03942008355807047) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00016211336028268183) - actor/ppo_kl:np.float64(3.904508581381133e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.22500828057527542) - perf/mfu/actor:np.float64(0.3045828706194756) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(112.56217575073242) - actor/lr:np.float64(1e-06) - training/global_step:18 - training/epoch:0 - critic/score/mean:0.4206081032752991 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4131574034690857 - critic/rewards/max:1.0009230375289917 - critic/rewards/min:-0.05197102576494217 - critic/advantages/mean:-0.1410631388425827 - critic/advantages/max:2.4748523235321045 - critic/advantages/min:-2.4748408794403076 - critic/returns/mean:-0.1410631388425827 - critic/returns/max:2.4748523235321045 - critic/returns/min:-2.4748408794403076 - response_length/mean:1227.023681640625 - response_length/max:8192.0 - response_length/min:149.0 - response_length/clip_ratio:0.005067567341029644 - response_length_non_aborted/mean:1227.023681640625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:149.0 - response_length_non_aborted/clip_ratio:0.005067567341029644 - response/aborted_ratio:0.0 - prompt_length/mean:236.82432556152344 - prompt_length/max:346.0 - prompt_length/min:184.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.890630513429642e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4542050380259752) - timing_s/agent_loop/generate_sequences/max:np.float64(28.114596697501838) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.505491919842825) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.114596697501838) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:245 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.04201129730791 - timing_s/reward:0.004744661971926689 - timing_s/old_log_prob:8.15154541656375 - timing_s/ref:16.869383794255555 - timing_s/adv:0.060672798193991184 - timing_s/update_actor:15.702786376699805 - timing_s/update_weights:24.100213211029768 - timing_s/step:95.30642331205308 - timing_s/stop_profile:6.106868386268616e-05 - timing_per_token_ms/adv:7.001262199311697e-05 - timing_per_token_ms/update_actor:0.018120035329760517 - timing_per_token_ms/gen:0.04135750827687839 - timing_per_token_ms/ref:0.019466215932018714 - perf/total_num_tokens:1632989 - perf/time_per_step:95.30642331205308 - perf/throughput:4283.5229338458485 - frontier/active_count:48.0 - frontier/completed_count:16.0 - frontier/blacklisted_count:1118.0 - frontier/mean_score:1.6939290130625 - frontier/mean_frontier_pct:0.38619628992029426 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:4.0 - frontier/batch_hard_count:12.0 - frontier/force_completed_count:15.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:1.763 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:112.0 - frontier/cluster_3/score:1.7844136069999998 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:1.343 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.2401 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:1.411649 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:3.6069394099999994 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:1.7081543 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.2401 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:1.2401 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.037448999999999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:1.6601 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.7680699999999998 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:1.763 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:2.6374489999999993 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:1.2401 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:1.763 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.4316489999999997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:2.1340999999999997 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.3234489999999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:1.2401 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.2401 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:1.7680699999999998 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:1.2401 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:1.46207 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.2401 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.2401 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.0755089999999994 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.1880699999999997 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.343 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.09 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:2.1880699999999997 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:1.49 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.49 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:1.49 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:2.7215080099999995 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:1.2401 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - 
frontier/cluster_52/score:1.343 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:2.6374489999999993 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:1.5880699999999999 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:1.2401 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:1.7081543 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.343 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:1.343 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:18.0 - cluster/prob_snapshot/cluster_0:0.021682825185373625 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02194618735052921 - cluster/prob_snapshot/cluster_4:0.01651731946906227 - cluster/prob_snapshot/cluster_5:0.015251770568565987 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.017361621378393362 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0443611098589136 - cluster/prob_snapshot/cluster_11:0.02100828762140911 - cluster/prob_snapshot/cluster_12:0.015251770568565987 - 
cluster/prob_snapshot/cluster_13:0.015251770568565987 - cluster/prob_snapshot/cluster_14:0.025058224895697274 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.020417276284877345 - cluster/prob_snapshot/cluster_17:0.021745180218663385 - cluster/prob_snapshot/cluster_18:0.021682825185373625 - cluster/prob_snapshot/cluster_19:0.03243751877614207 - cluster/prob_snapshot/cluster_20:0.015251770568565987 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.020907999327926925 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.021682825185373625 - cluster/prob_snapshot/cluster_25:0.01651731946906227 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.029906420975149513 - cluster/prob_snapshot/cluster_29:0.02624691845042873 - cluster/prob_snapshot/cluster_30:0.016276865177967975 - cluster/prob_snapshot/cluster_31:0.015251770568565987 - cluster/prob_snapshot/cluster_32:0.015251770568565987 - cluster/prob_snapshot/cluster_33:0.021745180218663385 - cluster/prob_snapshot/cluster_34:0.015251770568565987 - cluster/prob_snapshot/cluster_35:0.01798174033963654 - cluster/prob_snapshot/cluster_36:0.015251770568565987 - cluster/prob_snapshot/cluster_37:0.015251770568565987 - cluster/prob_snapshot/cluster_38:0.018325246469771246 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.03782514123825482 - cluster/prob_snapshot/cluster_41:0.02691068593497474 - cluster/prob_snapshot/cluster_42:0.01651731946906227 - cluster/prob_snapshot/cluster_43:0.02570454035021604 - cluster/prob_snapshot/cluster_44:0.02691068593497474 - cluster/prob_snapshot/cluster_45:0.018325246469771246 - cluster/prob_snapshot/cluster_46:0.018325246469771246 - cluster/prob_snapshot/cluster_47:0.018325246469771246 - cluster/prob_snapshot/cluster_48:0.015251770568565987 - cluster/prob_snapshot/cluster_49:0.03347134567295749 - 
cluster/prob_snapshot/cluster_50:0.018325246469771246 - cluster/prob_snapshot/cluster_51:0.015251770568565987 - cluster/prob_snapshot/cluster_52:0.01651731946906227 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.03243751877614207 - cluster/prob_snapshot/cluster_56:0.019531392054529945 - cluster/prob_snapshot/cluster_57:0.015251770568565987 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.02100828762140911 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.01651731946906227 - cluster/prob_snapshot/cluster_62:0.01651731946906227 - cluster/prob_snapshot/cluster_63:0.0
[36m(RewardLoopWorker pid=2826758)[0m WARNING:2026-04-12 11:59:59,865:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:   2%|▏         | 19/800 [29:23<22:16:17, 102.66s/it]
[36m(TaskRunner pid=2823680)[0m step:19 - global_seqlen/min:281777 - global_seqlen/max:404901 - global_seqlen/minmax_diff:123124 - global_seqlen/balanced_min:351340 - global_seqlen/balanced_max:351410 - global_seqlen/mean:351367.0 - frontier/skipped_zero_acc_count:47.0 - actor/entropy:np.float64(0.28361933051449495) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010486174374818802 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.03901318262796849) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0001876669127629678) - actor/ppo_kl:np.float64(8.743699799968054e-07) - actor/pg_clipfrac_lower:np.float64(2.060543654185561e-07) - actor/grad_norm:np.float64(0.26208668744022195) - perf/mfu/actor:np.float64(0.22325088466371504) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(111.8182144165039) - actor/lr:np.float64(1e-06) - training/global_step:19 - training/epoch:0 - critic/score/mean:0.5972222089767456 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5899133086204529 - critic/rewards/max:1.0023728609085083 - critic/rewards/min:-0.04292679950594902 - critic/advantages/mean:-0.14561165869235992 - critic/advantages/max:2.4748616218566895 - critic/advantages/min:-2.474851369857788 - critic/returns/mean:-0.14561165869235992 - critic/returns/max:2.4748616218566895 - critic/returns/min:-2.474851369857788 - response_length/mean:938.370361328125 - response_length/max:8192.0 - response_length/min:156.0 - response_length/clip_ratio:0.006172839552164078 - response_length_non_aborted/mean:938.370361328125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:156.0 - response_length_non_aborted/clip_ratio:0.006172839552164078 - response/aborted_ratio:0.0 - prompt_length/mean:237.92591857910156 - prompt_length/max:555.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.380971848964691e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1162822544574738) - timing_s/agent_loop/generate_sequences/max:np.float64(28.97038213443011) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.225214947359746) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.97038213443011) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:218 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.804369429126382 - timing_s/reward:0.00011926889419555664 - timing_s/old_log_prob:11.127881015650928 - timing_s/ref:21.343108561821282 - timing_s/adv:0.07570401858538389 - timing_s/update_actor:18.356913859024644 - timing_s/update_weights:23.850486430339515 - timing_s/step:106.9309555562213 - timing_s/stop_profile:5.192495882511139e-05 - timing_per_token_ms/adv:9.931782455051413e-05 - timing_per_token_ms/update_actor:0.024082852984656598 - timing_per_token_ms/gen:0.05230431242291335 - timing_per_token_ms/ref:0.02800050976309467 - perf/total_num_tokens:1405468 - perf/time_per_step:106.9309555562213 - perf/throughput:3285.9240635445462 - frontier/active_count:46.0 - frontier/completed_count:18.0 - frontier/blacklisted_count:1165.0 - frontier/mean_score:1.7585412875847828 - frontier/mean_frontier_pct:0.3964989720934929 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:16.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:1.763 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:2.1490895249 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.2401 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:1.411649 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:96.0 - frontier/cluster_10/score:4.024857587 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:1.7081543 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.2401 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:1.2401 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:1.7262142999999994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:1.6601 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.7680699999999998 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:1.763 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:2.6374489999999993 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:1.2401 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:1.763 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:1.2401 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.6021542999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:2.1340999999999997 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.3234489999999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:1.2401 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.2401 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.1376489999999997 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:1.7680699999999998 - frontier/cluster_34/cluster_size:121.0 
- frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:1.46207 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.2401 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:3.6528562999999994 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.1880699999999997 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.2401 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.09 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:2.1880699999999997 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:1.49 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.49 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:1.343 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:2.8050556069999995 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:1.2401 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - 
frontier/cluster_52/score:1.343 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:2.146214299999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.011649 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:1.2401 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:2.09570801 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.343 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:1.343 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:19.0 - cluster/prob_snapshot/cluster_0:0.02179424914677982 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.026567097302555458 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.015330146549586872 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.017450839485991257 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0497553879928446 - cluster/prob_snapshot/cluster_11:0.0211162452611136 - cluster/prob_snapshot/cluster_12:0.015330146549586872 - 
cluster/prob_snapshot/cluster_13:0.015330146549586872 - cluster/prob_snapshot/cluster_14:0.021339503423104998 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.02052219682845671 - cluster/prob_snapshot/cluster_17:0.021856924610860463 - cluster/prob_snapshot/cluster_18:0.02179424914677982 - cluster/prob_snapshot/cluster_19:0.03260420908560708 - cluster/prob_snapshot/cluster_20:0.015330146549586872 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.021015441604949343 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.02179424914677982 - cluster/prob_snapshot/cluster_25:0.015330146549586872 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.03216789514042225 - cluster/prob_snapshot/cluster_29:0.026381796428895522 - cluster/prob_snapshot/cluster_30:0.01636050892742859 - cluster/prob_snapshot/cluster_31:0.015330146549586872 - cluster/prob_snapshot/cluster_32:0.015330146549586872 - cluster/prob_snapshot/cluster_33:0.026425669253751973 - cluster/prob_snapshot/cluster_34:0.021856924610860463 - cluster/prob_snapshot/cluster_35:0.01807414512196958 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.015330146549586872 - cluster/prob_snapshot/cluster_38:0.018419416465514427 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.04515669897877724 - cluster/prob_snapshot/cluster_41:0.0270489748897303 - cluster/prob_snapshot/cluster_42:0.015330146549586872 - cluster/prob_snapshot/cluster_43:0.025836631149614195 - cluster/prob_snapshot/cluster_44:0.0270489748897303 - cluster/prob_snapshot/cluster_45:0.018419416465514427 - cluster/prob_snapshot/cluster_46:0.018419416465514427 - cluster/prob_snapshot/cluster_47:0.016602198867909982 - cluster/prob_snapshot/cluster_48:0.015330146549586872 - cluster/prob_snapshot/cluster_49:0.034676166063261314 - cluster/prob_snapshot/cluster_50:0.018419416465514427 - 
cluster/prob_snapshot/cluster_51:0.015330146549586872 - cluster/prob_snapshot/cluster_52:0.016602198867909982 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.026531553701974835 - cluster/prob_snapshot/cluster_56:0.024868054170091025 - cluster/prob_snapshot/cluster_57:0.015330146549586872 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.025907193708929178 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.016602198867909982 - cluster/prob_snapshot/cluster_62:0.016602198867909982 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:   2%|▎         | 20/800 [31:07<22:21:01, 103.16s/it]
[36m(TaskRunner pid=2823680)[0m step:20 - global_seqlen/min:355275 - global_seqlen/max:415042 - global_seqlen/minmax_diff:59767 - global_seqlen/balanced_min:386956 - global_seqlen/balanced_max:386987 - global_seqlen/mean:386972.75 - frontier/skipped_zero_acc_count:44.0 - actor/entropy:np.float64(0.37417788253653617) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009996707551181316 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.06676438296653942) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00019759293164021247) - actor/ppo_kl:np.float64(1.097041429635023e-05) - actor/pg_clipfrac_lower:np.float64(1.6629085058368565e-06) - actor/grad_norm:np.float64(0.22722929580645126) - perf/mfu/actor:np.float64(0.26667529212298263) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(112.00902557373047) - actor/lr:np.float64(1e-06) - training/global_step:20 - training/epoch:0 - critic/score/mean:0.5 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4923945367336273 - critic/rewards/max:1.0012153387069702 - critic/rewards/min:-0.04855331778526306 - critic/advantages/mean:-0.17347776889801025 - critic/advantages/max:2.4748375415802 - critic/advantages/min:-2.474850654602051 - critic/returns/mean:-0.17347776889801025 - critic/returns/max:2.4748375415802 - critic/returns/min:-2.474850654602051 - response_length/mean:1123.18896484375 - response_length/max:8192.0 - response_length/min:123.0 - response_length/clip_ratio:0.004464285913854837 - response_length_non_aborted/mean:1123.18896484375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:123.0 - response_length_non_aborted/clip_ratio:0.004464285913854837 - response/aborted_ratio:0.0 - prompt_length/mean:235.9761962890625 - prompt_length/max:364.0 - prompt_length/min:182.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.937103509902954e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1077503934502602) - timing_s/agent_loop/generate_sequences/max:np.float64(29.34840289875865) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.754811136355784) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.34840289875865) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:204 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.174078760668635 - timing_s/reward:0.00022487901151180267 - timing_s/old_log_prob:8.823815965093672 - timing_s/ref:20.2397019835189 - timing_s/adv:0.06450705975294113 - timing_s/update_actor:17.12521509733051 - timing_s/update_weights:26.292960815131664 - timing_s/step:104.11072856280953 - timing_s/stop_profile:5.767308175563812e-05 - timing_per_token_ms/adv:7.062618286231497e-05 - timing_per_token_ms/update_actor:0.018749708600156685 - timing_per_token_ms/gen:0.04130204146180907 - timing_per_token_ms/ref:0.022159634911922802 - perf/total_num_tokens:1547891 - perf/time_per_step:104.11072856280953 - perf/throughput:3716.9344153282063 - frontier/active_count:42.0 - frontier/completed_count:22.0 - frontier/blacklisted_count:1209.0 - frontier/mean_score:1.8129625062571428 - frontier/mean_frontier_pct:0.39627399932574603 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:8.0 - frontier/force_completed_count:19.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:1.763 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:2.1490895249 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.2401 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:1.411649 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:96.0 - frontier/cluster_10/score:3.7174003108999996 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:1.7081543 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.2401 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:1.7262142999999994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:1.46207 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.5376489999999998 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:1.763 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:2.746214299999999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:1.2401 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:2.1340999999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:1.2401 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.6021542999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.3234489999999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:1.7680699999999998 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.2401 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.3963542999999996 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:2.1376489999999997 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:1.46207 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:1.343 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:3.6528562999999994 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.4316489999999997 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.2401 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.09 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:2.1880699999999997 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:1.49 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.49 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:1.343 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:2.8050556069999995 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:1.2401 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:1.343 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:2.4023500099999993 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.011649 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:2.09570801 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.343 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:1.343 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:20.0 - cluster/prob_snapshot/cluster_0:0.023153369323037038 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02822385903475888 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.016286156152863433 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.01853909849772882 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.048820387022030856 - cluster/prob_snapshot/cluster_11:0.022433084156910837 - cluster/prob_snapshot/cluster_12:0.016286156152863433 - 
cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.02267026501339072 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.019201274354017448 - cluster/prob_snapshot/cluster_17:0.020193848659216435 - cluster/prob_snapshot/cluster_18:0.023153369323037038 - cluster/prob_snapshot/cluster_19:0.036065861558766656 - cluster/prob_snapshot/cluster_20:0.016286156152863433 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.02232599424229323 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.02802700253675175 - cluster/prob_snapshot/cluster_25:0.016286156152863433 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.034173930540799154 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01738077338468749 - cluster/prob_snapshot/cluster_31:0.02321995331763023 - cluster/prob_snapshot/cluster_32:0.016286156152863433 - cluster/prob_snapshot/cluster_33:0.03147117194370271 - cluster/prob_snapshot/cluster_34:0.028073611332966986 - cluster/prob_snapshot/cluster_35:0.019201274354017448 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.01763753545141165 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.047972734542190906 - cluster/prob_snapshot/cluster_41:0.031934695043104755 - cluster/prob_snapshot/cluster_42:0.016286156152863433 - cluster/prob_snapshot/cluster_43:0.02744783998023109 - cluster/prob_snapshot/cluster_44:0.028735787189255613 - cluster/prob_snapshot/cluster_45:0.01956807730648054 - cluster/prob_snapshot/cluster_46:0.01956807730648054 - cluster/prob_snapshot/cluster_47:0.01763753545141165 - cluster/prob_snapshot/cluster_48:0.016286156152863433 - cluster/prob_snapshot/cluster_49:0.03683862078305549 - cluster/prob_snapshot/cluster_50:0.01763753545141165 - cluster/prob_snapshot/cluster_51:0.016286156152863433 
- cluster/prob_snapshot/cluster_52:0.01763753545141165 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0315499132301371 - cluster/prob_snapshot/cluster_56:0.02641886117147937 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.027522802920463416 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.01763753545141165 - cluster/prob_snapshot/cluster_62:0.01763753545141165 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826760) WARNING:2026-04-12 12:03:29,340:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:   3%|▎         | 21/800 [32:51<22:23:32, 103.48s/it]
(RewardLoopWorker pid=2826759) WARNING:2026-04-12 12:03:30,548:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m step:21 - global_seqlen/min:360670 - global_seqlen/max:455749 - global_seqlen/minmax_diff:95079 - global_seqlen/balanced_min:399499 - global_seqlen/balanced_max:399606 - global_seqlen/mean:399562.0 - frontier/skipped_zero_acc_count:44.0 - actor/entropy:np.float64(0.32047432980367113) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010225615464150906 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.034293789125513285) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00022639352534627375) - actor/ppo_kl:np.float64(2.3690202931041832e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.21900998462330212) - perf/mfu/actor:np.float64(0.26228248721817454) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(112.43279266357422) - actor/lr:np.float64(1e-06) - training/global_step:21 - training/epoch:0 - critic/score/mean:0.5044642686843872 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4969865381717682 - critic/rewards/max:1.0006223917007446 - critic/rewards/min:-0.04338687285780907 - critic/advantages/mean:-0.2386326938867569 - critic/advantages/max:2.474835157394409 - critic/advantages/min:-2.4748635292053223 - critic/returns/mean:-0.2386326938867569 - critic/returns/max:2.474835157394409 - critic/returns/min:-2.4748635292053223 - response_length/mean:1173.105712890625 - response_length/max:8192.0 - response_length/min:68.0 - response_length/clip_ratio:0.011904762126505375 - response_length_non_aborted/mean:1173.105712890625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:68.0 - response_length_non_aborted/clip_ratio:0.011904762126505375 - response/aborted_ratio:0.0 - prompt_length/mean:231.01190185546875 - prompt_length/max:477.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.319186210632324e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.629387921653688) - timing_s/agent_loop/generate_sequences/max:np.float64(28.82880289107561) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.818309522141135) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.82880289107561) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:202 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.12812975887209 - timing_s/reward:0.0001318240538239479 - timing_s/old_log_prob:9.106413532979786 - timing_s/ref:19.62498962134123 - timing_s/adv:0.05837791133671999 - timing_s/update_actor:18.013032022863626 - timing_s/update_weights:26.747335641644895 - timing_s/step:104.04306548647583 - timing_s/stop_profile:5.949102342128754e-05 - timing_per_token_ms/adv:6.186938642059333e-05 - timing_per_token_ms/update_actor:0.019090358207592705 - timing_per_token_ms/gen:0.038217807786454214 - timing_per_token_ms/ref:0.02079872401360076 - perf/total_num_tokens:1598248 - perf/time_per_step:104.04306548647583 - perf/throughput:3840.352051640939 - frontier/active_count:41.0 - frontier/completed_count:23.0 - frontier/blacklisted_count:1253.0 - frontier/mean_score:1.8510725322917072 - frontier/mean_frontier_pct:0.43296979965447957 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:20.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:1.763 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:2.40436266743 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.2401 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:1.8881542999999998 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:112.0 - frontier/cluster_10/score:3.5021802176299994 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:1.7081543 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.7680699999999998 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:1.7262142999999994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:1.46207 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.5376489999999998 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:1.763 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:2.746214299999999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:1.2401 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.3938699999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:1.2401 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.6021542999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.3234489999999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:1.5376489999999998 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.2401 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.3963542999999996 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:2.1376489999999997 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:1.46207 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:1.343 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:3.6528562999999994 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.4316489999999997 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.3629999999999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:2.1880699999999997 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:1.49 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:1.343 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:1.343 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:2.8635389248999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:1.2401 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:1.2401 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:1.343 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:2.581645006999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:1.7081543 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.366995607 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.343 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:1.2401 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:21.0 - cluster/prob_snapshot/cluster_0:0.0232297758460951 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.031680547825932055 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.016339900752548236 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.02487884353479331 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.046145695648425106 - cluster/prob_snapshot/cluster_11:0.022507113726343442 - cluster/prob_snapshot/cluster_12:0.023296579569033107 - 
cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.02274507728379124 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.019264638894668333 - cluster/prob_snapshot/cluster_17:0.020260488712406287 - cluster/prob_snapshot/cluster_18:0.0232297758460951 - cluster/prob_snapshot/cluster_19:0.036184879531673826 - cluster/prob_snapshot/cluster_20:0.016339900752548236 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.019632652303279468 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.031542293536410485 - cluster/prob_snapshot/cluster_25:0.016339900752548236 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.03428670510831112 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017438130240350944 - cluster/prob_snapshot/cluster_31:0.020260488712406287 - cluster/prob_snapshot/cluster_32:0.016339900752548236 - cluster/prob_snapshot/cluster_33:0.031575027360650106 - cluster/prob_snapshot/cluster_34:0.028166254740572515 - cluster/prob_snapshot/cluster_35:0.019264638894668333 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.017695739626378743 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.048131045403854975 - cluster/prob_snapshot/cluster_41:0.03204008009437397 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.03113554187426133 - cluster/prob_snapshot/cluster_44:0.028830615788749465 - cluster/prob_snapshot/cluster_45:0.019632652303279468 - cluster/prob_snapshot/cluster_46:0.017695739626378743 - cluster/prob_snapshot/cluster_47:0.017695739626378743 - cluster/prob_snapshot/cluster_48:0.016339900752548236 - cluster/prob_snapshot/cluster_49:0.03773078125467677 - cluster/prob_snapshot/cluster_50:0.016339900752548236 - cluster/prob_snapshot/cluster_51:0.016339900752548236 - 
cluster/prob_snapshot/cluster_52:0.017695739626378743 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0340164689885426 - cluster/prob_snapshot/cluster_56:0.022507113726343442 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.031188189097732168 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.017695739626378743 - cluster/prob_snapshot/cluster_62:0.016339900752548236 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:   3%|▎         | 22/800 [34:35<22:23:16, 103.59s/it]
[36m(TaskRunner pid=2823680)[0m step:22 - global_seqlen/min:375340 - global_seqlen/max:471021 - global_seqlen/minmax_diff:95681 - global_seqlen/balanced_min:405964 - global_seqlen/balanced_max:406062 - global_seqlen/mean:405999.25 - frontier/skipped_zero_acc_count:51.0 - actor/entropy:np.float64(0.34532428198517895) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009774315170943737 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.009544390355586074) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002034199434348967) - actor/ppo_kl:np.float64(-4.476290667864118e-06) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.22501959502696992) - perf/mfu/actor:np.float64(0.26637581781537095) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(112.4375114440918) - actor/lr:np.float64(1e-06) - training/global_step:22 - training/epoch:0 - critic/score/mean:0.5146104097366333 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5066704750061035 - critic/rewards/max:1.0011584758758545 - critic/rewards/min:-0.04852114990353584 - critic/advantages/mean:-0.16522687673568726 - critic/advantages/max:2.4748566150665283 - critic/advantages/min:-2.47485613822937 - critic/returns/mean:-0.16522687673568726 - critic/returns/max:2.4748566150665283 - critic/returns/min:-2.47485613822937 - response_length/mean:1254.6234130859375 - response_length/max:8192.0 - response_length/min:35.0 - response_length/clip_ratio:0.009740259498357773 - response_length_non_aborted/mean:1254.6234130859375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:35.0 - response_length_non_aborted/clip_ratio:0.009740259498357773 - response/aborted_ratio:0.0 - prompt_length/mean:227.42857360839844 - prompt_length/max:328.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010617636144161224 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.46126383263617754) - timing_s/agent_loop/generate_sequences/max:np.float64(29.394455700181425) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.067195286088463) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.394455700181425) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:221 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.43567076418549 - timing_s/reward:0.00031210295855998993 - timing_s/old_log_prob:8.600847211666405 - timing_s/ref:19.112075805664062 - timing_s/adv:0.062275348231196404 - timing_s/update_actor:17.907256082631648 - timing_s/update_weights:26.03188073541969 - timing_s/step:103.62442328501493 - timing_s/stop_profile:5.7285651564598083e-05 - timing_per_token_ms/adv:6.821376582922545e-05 - timing_per_token_ms/update_actor:0.01961484612706984 - timing_per_token_ms/gen:0.040675101396633606 - timing_per_token_ms/ref:0.020934554370984487 - perf/total_num_tokens:1623997 - perf/time_per_step:103.62442328501493 - perf/throughput:3917.988029552791 - frontier/active_count:38.0 - frontier/completed_count:26.0 - frontier/blacklisted_count:1302.0 - frontier/mean_score:1.9049263952281847 - frontier/mean_frontier_pct:0.4459024964452806 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:8.0 - frontier/force_completed_count:23.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:1.763 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:2.40436266743 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:1.8881542999999998 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:112.0 - frontier/cluster_10/score:3.351526152340999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:96.0 - frontier/cluster_11/score:1.49570801 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.7680699999999998 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:2.1083500099999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:1.46207 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - 
frontier/cluster_17/score:1.5376489999999998 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:1.763 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:2.746214299999999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:1.2401 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.3938699999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.1215080099999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.8264142999999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:1.5376489999999998 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.2401 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:2.5774480099999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - 
frontier/cluster_34/score:2.1376489999999997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:1.46207 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:1.8400999999999998 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:3.4569994099999994 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:2.0021542999999995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.5540999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:1.8316489999999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:1.49 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:1.343 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:1.343 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:2.8635389248999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:1.2401 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - 
frontier/cluster_51/score:1.2401 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:1.343 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:2.581645006999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:1.49570801 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.366995607 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.8400999999999998 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:22.0 - cluster/prob_snapshot/cluster_0:0.02435513359378266 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.03321530004149817 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.026084089745987055 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.04629998138588849 - 
cluster/prob_snapshot/cluster_11:0.020662602609665803 - cluster/prob_snapshot/cluster_12:0.024425173597934943 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.02912600462620703 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.020197907075134323 - cluster/prob_snapshot/cluster_17:0.021242000462476634 - cluster/prob_snapshot/cluster_18:0.02435513359378266 - cluster/prob_snapshot/cluster_19:0.03793784240139326 - cluster/prob_snapshot/cluster_20:0.017131481094526304 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.020583748754813478 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0330703480749566 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.029307777086687456 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02523117655932787 - cluster/prob_snapshot/cluster_31:0.021242000462476634 - cluster/prob_snapshot/cluster_32:0.017131481094526304 - cluster/prob_snapshot/cluster_33:0.03560640420566038 - cluster/prob_snapshot/cluster_34:0.029530758350320988 - cluster/prob_snapshot/cluster_35:0.020197907075134323 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.02542023898237066 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.04775705187985129 - cluster/prob_snapshot/cluster_41:0.027658953744677476 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.035283860868905434 - cluster/prob_snapshot/cluster_44:0.02530349182752037 - cluster/prob_snapshot/cluster_45:0.020583748754813478 - cluster/prob_snapshot/cluster_46:0.018553003072291612 - cluster/prob_snapshot/cluster_47:0.018553003072291612 - cluster/prob_snapshot/cluster_48:0.017131481094526304 - cluster/prob_snapshot/cluster_49:0.039558634751523686 - 
cluster/prob_snapshot/cluster_50:0.017131481094526304 - cluster/prob_snapshot/cluster_51:0.017131481094526304 - cluster/prob_snapshot/cluster_52:0.018553003072291612 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.03566438402564206 - cluster/prob_snapshot/cluster_56:0.020662602609665803 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.032699089180023636 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.02542023898237066 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:   3%|▎         | 23/800 [36:18<22:17:48, 103.31s/it]
[36m(TaskRunner pid=2823680)[0m step:23 - global_seqlen/min:343785 - global_seqlen/max:422791 - global_seqlen/minmax_diff:79006 - global_seqlen/balanced_min:387333 - global_seqlen/balanced_max:387451 - global_seqlen/mean:387399.75 - frontier/skipped_zero_acc_count:46.0 - actor/entropy:np.float64(0.3477199455586875) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010902217589318752 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.03716694097965956) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00018872714991164508) - actor/ppo_kl:np.float64(2.1400105780062903e-05) - actor/pg_clipfrac_lower:np.float64(6.734203563565843e-07) - actor/grad_norm:np.float64(0.2175768349658359) - perf/mfu/actor:np.float64(0.26047867100456507) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(112.39956665039062) - actor/lr:np.float64(1e-06) - training/global_step:23 - training/epoch:0 - critic/score/mean:0.5106707215309143 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5033752918243408 - critic/rewards/max:1.0016578435897827 - critic/rewards/min:-0.05378391221165657 - critic/advantages/mean:-0.17349521815776825 - critic/advantages/max:2.4748566150665283 - critic/advantages/min:-2.474855899810791 - critic/returns/mean:-0.17349521815776825 - critic/returns/max:2.4748566150665283 - critic/returns/min:-2.474855899810791 - response_length/mean:1085.2713623046875 - response_length/max:8192.0 - response_length/min:143.0 - response_length/clip_ratio:0.006097560748457909 - response_length_non_aborted/mean:1085.2713623046875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:143.0 - response_length_non_aborted/clip_ratio:0.006097560748457909 - response/aborted_ratio:0.0 - prompt_length/mean:236.63414001464844 - prompt_length/max:409.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.768440991640091e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5593676073476672) - timing_s/agent_loop/generate_sequences/max:np.float64(29.610431311652064) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.055739180633282) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.610431311652064) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:242 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.390642025507987 - timing_s/reward:0.0001122448593378067 - timing_s/old_log_prob:9.324006334878504 - timing_s/ref:18.18419114407152 - timing_s/adv:0.06815768126398325 - timing_s/update_actor:17.57711460068822 - timing_s/update_weights:25.130828914232552 - timing_s/step:102.07517239637673 - timing_s/stop_profile:5.242787301540375e-05 - timing_per_token_ms/adv:7.859783117956485e-05 - timing_per_token_ms/update_actor:0.020269514167566012 - timing_per_token_ms/gen:0.04409181982912555 - timing_per_token_ms/ref:0.020969580525239017 - perf/total_num_tokens:1549599 - perf/time_per_step:102.07517239637673 - perf/throughput:3795.239732690877 - frontier/active_count:37.0 - frontier/completed_count:27.0 - frontier/blacklisted_count:1347.0 - frontier/mean_score:1.9473323434235403 - frontier/mean_frontier_pct:0.45646821774947716 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:23.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:2.1340999999999997 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:2.40436266743 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:96.0 - frontier/cluster_8/score:1.6217080099999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:112.0 - frontier/cluster_10/score:3.351526152340999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:96.0 - frontier/cluster_11/score:1.49570801 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.7680699999999998 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:2.3758450069999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:1.46207 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:96.0 - 
frontier/cluster_17/score:1.3763542999999998 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:1.5340999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:2.746214299999999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:1.2401 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.3938699999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.3850556069999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.8264142999999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:1.3763542999999998 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.7680699999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:2.5774480099999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - 
frontier/cluster_34/score:2.1376489999999997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:1.46207 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:1.8400999999999998 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:3.319899586999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:1.7015080099999995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.6878699999999993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.1821542999999997 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:1.49 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:1.2401 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:1.8400999999999998 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:2.8635389248999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:1.2401 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - 
frontier/cluster_51/score:1.2401 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:1.343 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:2.581645006999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:1.9469956069999999 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.366995607 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:23.0 - cluster/prob_snapshot/cluster_0:0.029619175470056602 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.033370151230149726 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.022507686663880002 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.04651583393405732 - cluster/prob_snapshot/cluster_11:0.02075893257118185 - 
cluster/prob_snapshot/cluster_12:0.024539044830768464 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.03297435460006132 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.020292070605644377 - cluster/prob_snapshot/cluster_17:0.019102422342283366 - cluster/prob_snapshot/cluster_18:0.021291775028636818 - cluster/prob_snapshot/cluster_19:0.038114710290088866 - cluster/prob_snapshot/cluster_20:0.017211348812341126 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.020679711096192467 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0332245234911693 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.033102188524237554 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.025348805413392343 - cluster/prob_snapshot/cluster_31:0.019102422342283366 - cluster/prob_snapshot/cluster_32:0.024539044830768464 - cluster/prob_snapshot/cluster_33:0.035772402827017574 - cluster/prob_snapshot/cluster_34:0.029668432043667602 - cluster/prob_snapshot/cluster_35:0.020292070605644377 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.02553874925376091 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.04607688881042193 - cluster/prob_snapshot/cluster_41:0.02361523092258883 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.037304949707464986 - cluster/prob_snapshot/cluster_44:0.030286121135110134 - cluster/prob_snapshot/cluster_45:0.020679711096192467 - cluster/prob_snapshot/cluster_46:0.017211348812341126 - cluster/prob_snapshot/cluster_47:0.02553874925376091 - cluster/prob_snapshot/cluster_48:0.017211348812341126 - cluster/prob_snapshot/cluster_49:0.03974305884539166 - cluster/prob_snapshot/cluster_50:0.017211348812341126 - 
cluster/prob_snapshot/cluster_51:0.017211348812341126 - cluster/prob_snapshot/cluster_52:0.01863949798804462 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0358306529514683 - cluster/prob_snapshot/cluster_56:0.02702235346195697 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.03285153377095082 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:   3%|▎         | 24/800 [38:12<22:56:31, 106.43s/it]
[36m(TaskRunner pid=2823680)[0m step:24 - global_seqlen/min:325451 - global_seqlen/max:466870 - global_seqlen/minmax_diff:141419 - global_seqlen/balanced_min:383022 - global_seqlen/balanced_max:383077 - global_seqlen/mean:383058.0 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.29785886095190534) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010947458446025848 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.03432691876514582) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002353279397000882) - actor/ppo_kl:np.float64(5.007981692693759e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.21505407415903532) - perf/mfu/actor:np.float64(0.20347466657735336) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(112.84661483764648) - actor/lr:np.float64(1e-06) - training/global_step:24 - training/epoch:0 - critic/score/mean:0.516581654548645 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5086161494255066 - critic/rewards/max:1.0015895366668701 - critic/rewards/min:-0.0839276909828186 - critic/advantages/mean:-0.24083811044692993 - critic/advantages/max:2.474853277206421 - critic/advantages/min:-2.474851369857788 - critic/returns/mean:-0.24083811044692993 - critic/returns/max:2.474853277206421 - critic/returns/min:-2.474851369857788 - response_length/mean:1219.5152587890625 - response_length/max:8192.0 - response_length/min:153.0 - response_length/clip_ratio:0.01658163219690323 - response_length_non_aborted/mean:1219.5152587890625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:153.0 - response_length_non_aborted/clip_ratio:0.01658163219690323 - response/aborted_ratio:0.0 - prompt_length/mean:221.551025390625 - prompt_length/max:409.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.760392665863037e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2557712150737643) - timing_s/agent_loop/generate_sequences/max:np.float64(28.893183081410825) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.45833554908495) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.893183081410825) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:174 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.511432906612754 - timing_s/reward:0.0001411791890859604 - timing_s/old_log_prob:11.302465735934675 - timing_s/ref:20.58377182483673 - timing_s/adv:0.07923853490501642 - timing_s/update_actor:22.218658975325525 - timing_s/update_weights:28.358227286487818 - timing_s/step:113.47663767356426 - timing_s/stop_profile:5.2354298532009125e-05 - timing_per_token_ms/adv:7.01352588476295e-05 - timing_per_token_ms/update_actor:0.019666080403298935 - timing_per_token_ms/gen:0.03191238668195037 - timing_per_token_ms/ref:0.01821901637537815 - perf/total_num_tokens:1532232 - perf/time_per_step:113.47663767356426 - perf/throughput:3375.655182011424 - frontier/active_count:36.0 - frontier/completed_count:28.0 - frontier/blacklisted_count:1376.0 - frontier/mean_score:1.9888836225649915 - frontier/mean_frontier_pct:0.4838685178371594 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:24.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:2.9938699999999994 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:144.0 - frontier/cluster_3/score:1.983053867201 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:1.4351956069999998 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:128.0 - frontier/cluster_10/score:3.846068306638699 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:96.0 - frontier/cluster_11/score:1.49570801 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.7680699999999998 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:1.9630915048999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:1.46207 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:96.0 - 
frontier/cluster_17/score:1.3763542999999998 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:1.5340999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:2.822350009999999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:1.2401 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.575709 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.3850556069999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.8264142999999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:1.3763542999999998 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.7680699999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:2.7042136069999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - 
frontier/cluster_34/score:2.3963542999999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:1.46207 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:1.8400999999999998 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:3.2239297108999994 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:1.7015080099999995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.1815089999999993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.4275080099999995 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:1.49 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:1.2401 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:1.8400999999999998 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:2.8635389248999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - 
frontier/cluster_51/score:1.2401 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:1.343 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:2.7071515048999992 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:1.9469956069999999 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.5568969248999998 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:24.0 - cluster/prob_snapshot/cluster_0:0.04181393753361152 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0276963563978828 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.02004468445844732 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.05371618003580437 - cluster/prob_snapshot/cluster_11:0.02088983198958619 
- cluster/prob_snapshot/cluster_12:0.02469378046977742 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.027417551716892193 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.020420026136661714 - cluster/prob_snapshot/cluster_17:0.019222876318717115 - cluster/prob_snapshot/cluster_18:0.02142603438703532 - cluster/prob_snapshot/cluster_19:0.03941840060394333 - cluster/prob_snapshot/cluster_20:0.017319878263061406 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.018757032906452278 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.03597368463919977 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0333109207030659 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02550864715258004 - cluster/prob_snapshot/cluster_31:0.019222876318717115 - cluster/prob_snapshot/cluster_32:0.02469378046977742 - cluster/prob_snapshot/cluster_33:0.03776844647250558 - cluster/prob_snapshot/cluster_34:0.033468724095769474 - cluster/prob_snapshot/cluster_35:0.020420026136661714 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.02569978872015103 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.04502707049548806 - cluster/prob_snapshot/cluster_41:0.023764141276367924 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.030468083468891873 - cluster/prob_snapshot/cluster_44:0.03390383292944637 - cluster/prob_snapshot/cluster_45:0.020810110968439235 - cluster/prob_snapshot/cluster_46:0.017319878263061406 - cluster/prob_snapshot/cluster_47:0.02569978872015103 - cluster/prob_snapshot/cluster_48:0.017319878263061406 - cluster/prob_snapshot/cluster_49:0.03999366630175448 - cluster/prob_snapshot/cluster_50:0.0 - 
cluster/prob_snapshot/cluster_51:0.017319878263061406 - cluster/prob_snapshot/cluster_52:0.018757032906452278 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.03780947867472903 - cluster/prob_snapshot/cluster_56:0.027192748078344767 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.03571094546444969 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m local_global_step_folder: /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/checkpoints/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633/global_step_25
[36m(WorkerDict pid=2825160)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.
[36m(WorkerDict pid=2825160)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")
[36m(TaskRunner pid=2823680)[0m test_gen_batch meta info: {'eos_token_id': 151643, 'pad_token_id': 151643, 'recompute_log_prob': False, 'do_sample': True, 'validate': True, 'global_steps': 25}
[36m(RewardLoopWorker pid=2826754)[0m WARNING:2026-04-12 12:13:22,073:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(WorkerDict pid=2825159)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.[32m [repeated 3x across cluster][0m
[36m(WorkerDict pid=2825159)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")[32m [repeated 3x across cluster][0m
[36m(TaskRunner pid=2823680)[0m validation generation end
[36m(TaskRunner pid=2823680)[0m Updated best checkpoint at step 25: val-core/aime2025/acc/best@16/mean=0.212933
[36m(TaskRunner pid=2823680)[0m Training Progress:   3%|▎         | 25/800 [43:00<34:38:36, 160.92s/it]
[36m(TaskRunner pid=2823680)[0m step:25 - global_seqlen/min:324344 - global_seqlen/max:399792 - global_seqlen/minmax_diff:75448 - global_seqlen/balanced_min:362252 - global_seqlen/balanced_max:362638 - global_seqlen/mean:362395.0 - frontier/skipped_zero_acc_count:43.0 - actor/entropy:np.float64(0.3146896403889323) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011845486238598824 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.07337228537653573) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00024657120843056513) - actor/ppo_kl:np.float64(5.39217600928253e-05) - actor/pg_clipfrac_lower:np.float64(9.089983291761559e-07) - actor/grad_norm:np.float64(0.23699502511457962) - perf/mfu/actor:np.float64(0.23310181390873697) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(112.66510772705078) - actor/lr:np.float64(1e-06) - val-aux/aime2024/reward/mean@16:np.float64(0.06666666666666667) - val-aux/aime2024/reward/std@16:np.float64(0.11800799734358997) - val-aux/aime2024/reward/best@2/mean:np.float64(0.11456666666666666) - val-aux/aime2024/reward/best@2/std:np.float64(0.1355867324028649) - val-aux/aime2024/reward/worst@2/mean:np.float64(0.018266666666666664) - val-aux/aime2024/reward/worst@2/std:np.float64(0.06368976888977204) - val-aux/aime2024/reward/maj@2/mean:np.float64(0.06793333333333333) - val-aux/aime2024/reward/maj@2/std:np.float64(0.11882001972711712) - val-aux/aime2024/reward/best@4/mean:np.float64(0.178) - val-aux/aime2024/reward/best@4/std:np.float64(0.13288997773471006) - val-aux/aime2024/reward/worst@4/mean:np.float64(0.0019333333333333336) - val-aux/aime2024/reward/worst@4/std:np.float64(0.01565266185887839) - val-aux/aime2024/reward/maj@4/mean:np.float64(0.07816666666666668) - val-aux/aime2024/reward/maj@4/std:np.float64(0.11979329044899127) - val-aux/aime2024/reward/best@8/mean:np.float64(0.23813333333333334) - 
val-aux/aime2024/reward/best@8/std:np.float64(0.10163921527150084) - val-aux/aime2024/reward/worst@8/mean:np.float64(0.0) - val-aux/aime2024/reward/worst@8/std:np.float64(0.0) - val-aux/aime2024/reward/maj@8/mean:np.float64(0.09269999999999999) - val-aux/aime2024/reward/maj@8/std:np.float64(0.11636814371989081) - val-aux/aime2024/reward/best@16/mean:np.float64(0.27726666666666666) - val-aux/aime2024/reward/best@16/std:np.float64(0.05696037024219323) - val-aux/aime2024/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2024/reward/worst@16/std:np.float64(0.0) - val-aux/aime2024/reward/maj@16/mean:np.float64(0.10986666666666665) - val-aux/aime2024/reward/maj@16/std:np.float64(0.10616423558353745) - val-aux/aime2024/score/mean@16:np.float64(0.06666666666666667) - val-aux/aime2024/score/std@16:np.float64(0.11800799734358997) - val-aux/aime2024/score/best@2/mean:np.float64(0.11456666666666666) - val-aux/aime2024/score/best@2/std:np.float64(0.1355867324028649) - val-aux/aime2024/score/worst@2/mean:np.float64(0.018266666666666664) - val-aux/aime2024/score/worst@2/std:np.float64(0.06368976888977204) - val-aux/aime2024/score/maj@2/mean:np.float64(0.06793333333333333) - val-aux/aime2024/score/maj@2/std:np.float64(0.11882001972711712) - val-aux/aime2024/score/best@4/mean:np.float64(0.178) - val-aux/aime2024/score/best@4/std:np.float64(0.13288997773471006) - val-aux/aime2024/score/worst@4/mean:np.float64(0.0019333333333333336) - val-aux/aime2024/score/worst@4/std:np.float64(0.01565266185887839) - val-aux/aime2024/score/maj@4/mean:np.float64(0.07816666666666668) - val-aux/aime2024/score/maj@4/std:np.float64(0.11979329044899127) - val-aux/aime2024/score/best@8/mean:np.float64(0.23813333333333334) - val-aux/aime2024/score/best@8/std:np.float64(0.10163921527150084) - val-aux/aime2024/score/worst@8/mean:np.float64(0.0) - val-aux/aime2024/score/worst@8/std:np.float64(0.0) - val-aux/aime2024/score/maj@8/mean:np.float64(0.09269999999999999) - 
val-aux/aime2024/score/maj@8/std:np.float64(0.11636814371989081) - val-aux/aime2024/score/best@16/mean:np.float64(0.27726666666666666) - val-aux/aime2024/score/best@16/std:np.float64(0.05696037024219323) - val-aux/aime2024/score/worst@16/mean:np.float64(0.0) - val-aux/aime2024/score/worst@16/std:np.float64(0.0) - val-aux/aime2024/score/maj@16/mean:np.float64(0.10986666666666665) - val-aux/aime2024/score/maj@16/std:np.float64(0.10616423558353745) - val-core/aime2024/acc/mean@16:np.float64(0.06666666666666667) - val-aux/aime2024/acc/std@16:np.float64(0.11800799734358997) - val-aux/aime2024/acc/best@2/mean:np.float64(0.11456666666666666) - val-aux/aime2024/acc/best@2/std:np.float64(0.1355867324028649) - val-aux/aime2024/acc/worst@2/mean:np.float64(0.018266666666666664) - val-aux/aime2024/acc/worst@2/std:np.float64(0.06368976888977204) - val-aux/aime2024/acc/maj@2/mean:np.float64(0.06793333333333333) - val-aux/aime2024/acc/maj@2/std:np.float64(0.11882001972711712) - val-aux/aime2024/acc/best@4/mean:np.float64(0.178) - val-aux/aime2024/acc/best@4/std:np.float64(0.13288997773471006) - val-aux/aime2024/acc/worst@4/mean:np.float64(0.0019333333333333336) - val-aux/aime2024/acc/worst@4/std:np.float64(0.01565266185887839) - val-aux/aime2024/acc/maj@4/mean:np.float64(0.07816666666666668) - val-aux/aime2024/acc/maj@4/std:np.float64(0.11979329044899127) - val-aux/aime2024/acc/best@8/mean:np.float64(0.23813333333333334) - val-aux/aime2024/acc/best@8/std:np.float64(0.10163921527150084) - val-aux/aime2024/acc/worst@8/mean:np.float64(0.0) - val-aux/aime2024/acc/worst@8/std:np.float64(0.0) - val-aux/aime2024/acc/maj@8/mean:np.float64(0.09269999999999999) - val-aux/aime2024/acc/maj@8/std:np.float64(0.11636814371989081) - val-core/aime2024/acc/best@16/mean:np.float64(0.27726666666666666) - val-core/aime2024/acc/best@16/std:np.float64(0.05696037024219323) - val-aux/aime2024/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2024/acc/worst@16/std:np.float64(0.0) - 
val-core/aime2024/acc/maj@16/mean:np.float64(0.10986666666666665) - val-core/aime2024/acc/maj@16/std:np.float64(0.10616423558353745) - val-aux/aime2025/reward/mean@16:np.float64(0.04583333333333333) - val-aux/aime2025/reward/std@16:np.float64(0.08727893993505527) - val-aux/aime2025/reward/best@2/mean:np.float64(0.07676666666666666) - val-aux/aime2025/reward/best@2/std:np.float64(0.10360897560420991) - val-aux/aime2025/reward/worst@2/mean:np.float64(0.014766666666666668) - val-aux/aime2025/reward/worst@2/std:np.float64(0.040741310096092914) - val-aux/aime2025/reward/maj@2/mean:np.float64(0.04643333333333333) - val-aux/aime2025/reward/maj@2/std:np.float64(0.08760512729638062) - val-aux/aime2025/reward/best@4/mean:np.float64(0.1172) - val-aux/aime2025/reward/best@4/std:np.float64(0.11231552236909025) - val-aux/aime2025/reward/worst@4/mean:np.float64(0.0035000000000000005) - val-aux/aime2025/reward/worst@4/std:np.float64(0.013548585583081979) - val-aux/aime2025/reward/maj@4/mean:np.float64(0.050666666666666665) - val-aux/aime2025/reward/maj@4/std:np.float64(0.08329092540400852) - val-aux/aime2025/reward/best@8/mean:np.float64(0.16486666666666666) - val-aux/aime2025/reward/best@8/std:np.float64(0.10603418787427207) - val-aux/aime2025/reward/worst@8/mean:np.float64(0.0004333333333333333) - val-aux/aime2025/reward/worst@8/std:np.float64(0.003775800135953526) - val-aux/aime2025/reward/maj@8/mean:np.float64(0.05453333333333333) - val-aux/aime2025/reward/maj@8/std:np.float64(0.07637921379552357) - val-aux/aime2025/reward/best@16/mean:np.float64(0.21293333333333334) - val-aux/aime2025/reward/best@16/std:np.float64(0.08425793327069489) - val-aux/aime2025/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2025/reward/worst@16/std:np.float64(0.0) - val-aux/aime2025/reward/maj@16/mean:np.float64(0.057466666666666666) - val-aux/aime2025/reward/maj@16/std:np.float64(0.066636312266791) - val-aux/aime2025/score/mean@16:np.float64(0.04583333333333333) - 
val-aux/aime2025/score/std@16:np.float64(0.08727893993505527) - val-aux/aime2025/score/best@2/mean:np.float64(0.07676666666666666) - val-aux/aime2025/score/best@2/std:np.float64(0.10360897560420991) - val-aux/aime2025/score/worst@2/mean:np.float64(0.014766666666666668) - val-aux/aime2025/score/worst@2/std:np.float64(0.040741310096092914) - val-aux/aime2025/score/maj@2/mean:np.float64(0.04643333333333333) - val-aux/aime2025/score/maj@2/std:np.float64(0.08760512729638062) - val-aux/aime2025/score/best@4/mean:np.float64(0.1172) - val-aux/aime2025/score/best@4/std:np.float64(0.11231552236909025) - val-aux/aime2025/score/worst@4/mean:np.float64(0.0035000000000000005) - val-aux/aime2025/score/worst@4/std:np.float64(0.013548585583081979) - val-aux/aime2025/score/maj@4/mean:np.float64(0.050666666666666665) - val-aux/aime2025/score/maj@4/std:np.float64(0.08329092540400852) - val-aux/aime2025/score/best@8/mean:np.float64(0.16486666666666666) - val-aux/aime2025/score/best@8/std:np.float64(0.10603418787427207) - val-aux/aime2025/score/worst@8/mean:np.float64(0.0004333333333333333) - val-aux/aime2025/score/worst@8/std:np.float64(0.003775800135953526) - val-aux/aime2025/score/maj@8/mean:np.float64(0.05453333333333333) - val-aux/aime2025/score/maj@8/std:np.float64(0.07637921379552357) - val-aux/aime2025/score/best@16/mean:np.float64(0.21293333333333334) - val-aux/aime2025/score/best@16/std:np.float64(0.08425793327069489) - val-aux/aime2025/score/worst@16/mean:np.float64(0.0) - val-aux/aime2025/score/worst@16/std:np.float64(0.0) - val-aux/aime2025/score/maj@16/mean:np.float64(0.057466666666666666) - val-aux/aime2025/score/maj@16/std:np.float64(0.066636312266791) - val-core/aime2025/acc/mean@16:np.float64(0.04583333333333333) - val-aux/aime2025/acc/std@16:np.float64(0.08727893993505527) - val-aux/aime2025/acc/best@2/mean:np.float64(0.07676666666666666) - val-aux/aime2025/acc/best@2/std:np.float64(0.10360897560420991) - 
val-aux/aime2025/acc/worst@2/mean:np.float64(0.014766666666666668) - val-aux/aime2025/acc/worst@2/std:np.float64(0.040741310096092914) - val-aux/aime2025/acc/maj@2/mean:np.float64(0.04643333333333333) - val-aux/aime2025/acc/maj@2/std:np.float64(0.08760512729638062) - val-aux/aime2025/acc/best@4/mean:np.float64(0.1172) - val-aux/aime2025/acc/best@4/std:np.float64(0.11231552236909025) - val-aux/aime2025/acc/worst@4/mean:np.float64(0.0035000000000000005) - val-aux/aime2025/acc/worst@4/std:np.float64(0.013548585583081979) - val-aux/aime2025/acc/maj@4/mean:np.float64(0.050666666666666665) - val-aux/aime2025/acc/maj@4/std:np.float64(0.08329092540400852) - val-aux/aime2025/acc/best@8/mean:np.float64(0.16486666666666666) - val-aux/aime2025/acc/best@8/std:np.float64(0.10603418787427207) - val-aux/aime2025/acc/worst@8/mean:np.float64(0.0004333333333333333) - val-aux/aime2025/acc/worst@8/std:np.float64(0.003775800135953526) - val-aux/aime2025/acc/maj@8/mean:np.float64(0.05453333333333333) - val-aux/aime2025/acc/maj@8/std:np.float64(0.07637921379552357) - val-core/aime2025/acc/best@16/mean:np.float64(0.21293333333333334) - val-core/aime2025/acc/best@16/std:np.float64(0.08425793327069489) - val-aux/aime2025/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2025/acc/worst@16/std:np.float64(0.0) - val-core/aime2025/acc/maj@16/mean:np.float64(0.057466666666666666) - val-core/aime2025/acc/maj@16/std:np.float64(0.066636312266791) - val-aux/math500/reward/mean@4:np.float64(0.625) - val-aux/math500/reward/std@4:np.float64(0.14660254037844384) - val-aux/math500/reward/best@2/mean:np.float64(0.691692) - val-aux/math500/reward/best@2/std:np.float64(0.12500096594357882) - val-aux/math500/reward/worst@2/mean:np.float64(0.55822) - val-aux/math500/reward/worst@2/std:np.float64(0.12735185671929375) - val-aux/math500/reward/maj@2/mean:np.float64(0.6250319999999999) - val-aux/math500/reward/maj@2/std:np.float64(0.14653430770550413) - val-aux/math500/reward/best@4/mean:np.float64(0.743308) - 
val-aux/math500/reward/best@4/std:np.float64(0.07959905342859241) - val-aux/math500/reward/worst@4/mean:np.float64(0.504978) - val-aux/math500/reward/worst@4/std:np.float64(0.08504957878209168) - val-aux/math500/reward/maj@4/mean:np.float64(0.63914) - val-aux/math500/reward/maj@4/std:np.float64(0.1358202455326984) - val-aux/math500/score/mean@4:np.float64(0.625) - val-aux/math500/score/std@4:np.float64(0.14660254037844384) - val-aux/math500/score/best@2/mean:np.float64(0.691692) - val-aux/math500/score/best@2/std:np.float64(0.12500096594357882) - val-aux/math500/score/worst@2/mean:np.float64(0.55822) - val-aux/math500/score/worst@2/std:np.float64(0.12735185671929375) - val-aux/math500/score/maj@2/mean:np.float64(0.6250319999999999) - val-aux/math500/score/maj@2/std:np.float64(0.14653430770550413) - val-aux/math500/score/best@4/mean:np.float64(0.743308) - val-aux/math500/score/best@4/std:np.float64(0.07959905342859241) - val-aux/math500/score/worst@4/mean:np.float64(0.504978) - val-aux/math500/score/worst@4/std:np.float64(0.08504957878209168) - val-aux/math500/score/maj@4/mean:np.float64(0.63914) - val-aux/math500/score/maj@4/std:np.float64(0.1358202455326984) - val-core/math500/acc/mean@4:np.float64(0.625) - val-aux/math500/acc/std@4:np.float64(0.14660254037844384) - val-aux/math500/acc/best@2/mean:np.float64(0.691692) - val-aux/math500/acc/best@2/std:np.float64(0.12500096594357882) - val-aux/math500/acc/worst@2/mean:np.float64(0.55822) - val-aux/math500/acc/worst@2/std:np.float64(0.12735185671929375) - val-aux/math500/acc/maj@2/mean:np.float64(0.6250319999999999) - val-aux/math500/acc/maj@2/std:np.float64(0.14653430770550413) - val-core/math500/acc/best@4/mean:np.float64(0.743308) - val-core/math500/acc/best@4/std:np.float64(0.07959905342859241) - val-aux/math500/acc/worst@4/mean:np.float64(0.504978) - val-aux/math500/acc/worst@4/std:np.float64(0.08504957878209168) - val-core/math500/acc/maj@4/mean:np.float64(0.63914) - 
val-core/math500/acc/maj@4/std:np.float64(0.1358202455326984) - val-aux/num_turns/min:np.int32(2) - val-aux/num_turns/max:np.int32(2) - val-aux/num_turns/mean:np.float64(2.0) - val-aux/response_length/clip_ratio:0.06621621621621622 - val-aux/aime2024/response_length/clip_ratio:0.13958333333333334 - val-aux/aime2025/response_length/clip_ratio:0.11666666666666667 - val-aux/math500/response_length/clip_ratio:0.0365 - val-best/metric:0.21293333333333334 - val-best/step:25.0 - training/global_step:25 - training/epoch:0 - critic/score/mean:0.5191176533699036 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5110699534416199 - critic/rewards/max:1.0006113052368164 - critic/rewards/min:-0.0634409710764885 - critic/advantages/mean:-0.16582578420639038 - critic/advantages/max:2.474853277206421 - critic/advantages/min:-2.474848747253418 - critic/returns/mean:-0.16582578420639038 - critic/returns/max:2.474853277206421 - critic/returns/min:-2.474848747253418 - response_length/mean:1114.01025390625 - response_length/max:8192.0 - response_length/min:136.0 - response_length/clip_ratio:0.01617647148668766 - response_length_non_aborted/mean:1114.01025390625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:136.0 - response_length_non_aborted/clip_ratio:0.01617647148668766 - response/aborted_ratio:0.0 - prompt_length/mean:239.78823852539062 - prompt_length/max:879.0 - prompt_length/min:178.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00012154504656791687 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.114282657392323) - timing_s/agent_loop/generate_sequences/max:np.float64(27.914898935705423) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.295512249706917) 
- timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.914898935705423) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:198 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.4115847973153 - timing_s/reward:0.00024963170289993286 - timing_s/old_log_prob:9.292845944873989 - timing_s/ref:18.86300225649029 - timing_s/adv:0.0679437043145299 - timing_s/update_actor:18.229790003038943 - timing_s/save_checkpoint:52.179411485791206 - timing_s/update_weights:25.979535387828946 - timing_s/step:155.4221176598221 - timing_s/testing:132.37399765104055 - timing_s/stop_profile:0.00010551325976848602 - timing_per_token_ms/adv:7.380508255586937e-05 - timing_per_token_ms/update_actor:0.019802440413345612 - timing_per_token_ms/gen:0.0401458757210176 - timing_per_token_ms/ref:0.02049027872173426 - perf/total_num_tokens:1449580 - perf/time_per_step:155.4221176598221 - perf/throughput:2331.6822950076307 - frontier/active_count:33.0 - frontier/completed_count:31.0 - frontier/blacklisted_count:1417.0 - frontier/mean_score:1.9569121605053659 - frontier/mean_frontier_pct:0.45769409783984594 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:24.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:2.9957089999999993 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - 
frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.3046369248999998 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:128.0 - frontier/cluster_10/score:3.5922478146470893 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:96.0 - frontier/cluster_11/score:1.9469956069999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.7680699999999998 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:1.9630915048999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:1.46207 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:96.0 - frontier/cluster_17/score:1.3763542999999998 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:1.9738699999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:2.8756450069999993 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:1.7680699999999998 - 
frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.575709 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.3850556069999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.8264142999999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:1.3763542999999998 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.7680699999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:2.7042136069999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:1.9774480099999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:1.46207 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - 
frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:1.5880699999999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:1.7015080099999995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.1815089999999993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.5992556069999995 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:1.343 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:1.2401 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:1.8400999999999998 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:2.8635389248999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:1.2401 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:1.343 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - 
frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.2628969248999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:2.6898278474299997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:25.0 - cluster/prob_snapshot/cluster_0:0.04638892968124701 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.020202466450761555 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.055626408062758134 - cluster/prob_snapshot/cluster_11:0.030149471227953 - cluster/prob_snapshot/cluster_12:0.027378785757068664 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.030398718226189354 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0226403373689036 - cluster/prob_snapshot/cluster_17:0.021313019001238758 - cluster/prob_snapshot/cluster_18:0.030565624574991446 - cluster/prob_snapshot/cluster_19:0.04452972368743161 - cluster/prob_snapshot/cluster_20:0.027378785757068664 - 
cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.020796523481391133 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.03988517699160869 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.03693288528324577 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02828225456194977 - cluster/prob_snapshot/cluster_31:0.021313019001238758 - cluster/prob_snapshot/cluster_32:0.027378785757068664 - cluster/prob_snapshot/cluster_33:0.04187508695210194 - cluster/prob_snapshot/cluster_34:0.030621030508708236 - cluster/prob_snapshot/cluster_35:0.0226403373689036 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.024591463175795095 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.026348064991615853 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.03378094053861956 - cluster/prob_snapshot/cluster_44:0.04024979915496132 - cluster/prob_snapshot/cluster_45:0.020796523481391133 - cluster/prob_snapshot/cluster_46:0.01920310407242974 - cluster/prob_snapshot/cluster_47:0.028494179343341638 - cluster/prob_snapshot/cluster_48:0.01920310407242974 - cluster/prob_snapshot/cluster_49:0.044342259487386716 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.01920310407242974 - cluster/prob_snapshot/cluster_52:0.020796523481391133 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.03504124276593495 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.04165232166044509 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - 
cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:   3%|▎         | 26/800 [44:46<31:06:05, 144.66s/it]
[36m(TaskRunner pid=2823680)[0m step:26 - global_seqlen/min:340825 - global_seqlen/max:465907 - global_seqlen/minmax_diff:125082 - global_seqlen/balanced_min:387969 - global_seqlen/balanced_max:388105 - global_seqlen/mean:388039.5 - frontier/skipped_zero_acc_count:44.0 - actor/entropy:np.float64(0.3168310701314892) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010239139199256897 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.08195427290047519) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00020708732097860083) - actor/ppo_kl:np.float64(3.7058631321143376e-05) - actor/pg_clipfrac_lower:np.float64(3.6430297941911876e-06) - actor/grad_norm:np.float64(0.2281945442611521) - perf/mfu/actor:np.float64(0.2569922488946057) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(115.19398498535156) - actor/lr:np.float64(1e-06) - training/global_step:26 - training/epoch:0 - critic/score/mean:0.5565476417541504 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5487276911735535 - critic/rewards/max:1.0011399984359741 - critic/rewards/min:-0.06712307780981064 - critic/advantages/mean:-0.17545436322689056 - critic/advantages/max:2.474839925765991 - critic/advantages/min:-2.4748384952545166 - critic/returns/mean:-0.17545436322689056 - critic/returns/max:2.474839925765991 - critic/returns/min:-2.4748384952545166 - response_length/mean:1116.8824462890625 - response_length/max:8192.0 - response_length/min:149.0 - response_length/clip_ratio:0.0059523810632526875 - response_length_non_aborted/mean:1116.8824462890625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:149.0 - response_length_non_aborted/clip_ratio:0.0059523810632526875 - response/aborted_ratio:0.0 - prompt_length/mean:226.6904754638672 - prompt_length/max:527.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.205083966255188e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1592975333333015) - timing_s/agent_loop/generate_sequences/max:np.float64(28.374419895000756) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.777956181381342) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.374419895000756) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:325 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.812639982439578 - timing_s/reward:0.0002212878316640854 - timing_s/old_log_prob:9.053415915928781 - timing_s/ref:19.74567369930446 - timing_s/adv:0.1033912030979991 - timing_s/update_actor:17.817653188481927 - timing_s/update_weights:28.422626612707973 - timing_s/step:106.40835126396269 - timing_s/stop_profile:8.151214569807053e-05 - timing_per_token_ms/adv:0.00011451254716623686 - timing_per_token_ms/update_actor:0.01973422099754223 - timing_per_token_ms/gen:0.04105368763024146 - timing_per_token_ms/ref:0.021869630327035858 - perf/total_num_tokens:1552158 - perf/time_per_step:106.40835126396269 - perf/throughput:3646.701554818821 - frontier/active_count:32.0 - frontier/completed_count:32.0 - frontier/blacklisted_count:1461.0 - frontier/mean_score:2.006456815931686 - frontier/mean_frontier_pct:0.49882836800523245 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:25.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:2.9969962999999993 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.3046369248999998 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:144.0 - frontier/cluster_10/score:3.414573470252962 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:96.0 - frontier/cluster_11/score:1.9469956069999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.7680699999999998 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:2.27416405343 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:1.46207 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:96.0 - 
frontier/cluster_17/score:1.8634480099999997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:2.2817089999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:2.912951504899999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:1.7680699999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.575709 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:1.9695389248999997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.8264142999999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:1.3763542999999998 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.7680699999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:112.0 - frontier/cluster_33/score:2.792949524899999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:112.0 - 
frontier/cluster_34/score:1.6842136069999998 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:1.46207 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:1.5880699999999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:1.7015080099999995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.4270562999999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.5992556069999995 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:1.2401 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:1.2401 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.1880699999999997 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:2.9044772474299996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - 
frontier/cluster_51/score:1.2401 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:1.343 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.2628969248999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:2.7828794932009995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:26.0 - cluster/prob_snapshot/cluster_0:0.04667737358280065 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.020319352791150776 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.053181020442673754 - cluster/prob_snapshot/cluster_11:0.030323908411902523 - 
cluster/prob_snapshot/cluster_12:0.027537192458509996 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.035419464852368465 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.022771328611318392 - cluster/prob_snapshot/cluster_17:0.0290226781110462 - cluster/prob_snapshot/cluster_18:0.035536975270953286 - cluster/prob_snapshot/cluster_19:0.04536839956152053 - cluster/prob_snapshot/cluster_20:0.027537192458509996 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.019314208355889897 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0401159424966864 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.030675014241233752 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02844588850445673 - cluster/prob_snapshot/cluster_31:0.0214363307166559 - cluster/prob_snapshot/cluster_32:0.027537192458509996 - cluster/prob_snapshot/cluster_33:0.04349940250899305 - cluster/prob_snapshot/cluster_34:0.026231152746893683 - cluster/prob_snapshot/cluster_35:0.022771328611318392 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.024733743136632583 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.026500508204463816 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.03780071854662947 - cluster/prob_snapshot/cluster_44:0.040482674271278975 - cluster/prob_snapshot/cluster_45:0.019314208355889897 - cluster/prob_snapshot/cluster_46:0.019314208355889897 - cluster/prob_snapshot/cluster_47:0.034078574209557295 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.04523641538731117 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.019314208355889897 - 
cluster/prob_snapshot/cluster_52:0.020916846884896485 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.03524398249771883 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.04334256460044947 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826756) WARNING:2026-04-12 12:17:10,534:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:   3%|▎         | 27/800 [46:32<28:34:30, 133.08s/it]
[36m(TaskRunner pid=2823680)[0m step:27 - global_seqlen/min:354918 - global_seqlen/max:416089 - global_seqlen/minmax_diff:61171 - global_seqlen/balanced_min:392335 - global_seqlen/balanced_max:392415 - global_seqlen/mean:392387.75 - frontier/skipped_zero_acc_count:42.0 - actor/entropy:np.float64(0.35459426703841185) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009744239039719105 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.02616776601644233) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00024038862711465701) - actor/ppo_kl:np.float64(-2.61012675452527e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.24465521628206427) - perf/mfu/actor:np.float64(0.2424983262074905) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(114.93108749389648) - actor/lr:np.float64(1e-06) - training/global_step:27 - training/epoch:0 - critic/score/mean:0.4505814015865326 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.443263441324234 - critic/rewards/max:1.000882863998413 - critic/rewards/min:-0.10101017355918884 - critic/advantages/mean:-0.1433938890695572 - critic/advantages/max:2.474853992462158 - critic/advantages/min:-2.474860668182373 - critic/returns/mean:-0.1433938890695572 - critic/returns/max:2.474853992462158 - critic/returns/min:-2.474860668182373 - response_length/mean:1140.4520263671875 - response_length/max:8192.0 - response_length/min:179.0 - response_length/clip_ratio:0.007267442066222429 - response_length_non_aborted/mean:1140.4520263671875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:179.0 - response_length_non_aborted/clip_ratio:0.007267442066222429 - response/aborted_ratio:0.0 - prompt_length/mean:236.97674560546875 - prompt_length/max:366.0 - prompt_length/min:183.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.400995284318924e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.389139604754746) - timing_s/agent_loop/generate_sequences/max:np.float64(28.165527399629354) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.930669700441285) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.165527399629354) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:358 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.60052544437349 - timing_s/reward:0.0001975884661078453 - timing_s/old_log_prob:9.21519586071372 - timing_s/ref:19.746768199838698 - timing_s/adv:0.10409737285226583 - timing_s/update_actor:19.046727016568184 - timing_s/update_weights:27.719535957090557 - timing_s/step:105.83768767584115 - timing_s/stop_profile:6.265006959438324e-05 - timing_per_token_ms/adv:0.00010984547680816004 - timing_per_token_ms/update_actor:0.020098459292906698 - timing_per_token_ms/gen:0.03772540907047196 - timing_per_token_ms/ref:0.02083715572159399 - perf/total_num_tokens:1569551 - perf/time_per_step:105.83768767584115 - perf/throughput:3707.4482503983095 - frontier/active_count:30.0 - frontier/completed_count:34.0 - frontier/blacklisted_count:1503.0 - frontier/mean_score:2.027750670612302 - frontier/mean_frontier_pct:0.5405625575178837 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:27.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:2.9969962999999993 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.3046369248999998 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:144.0 - frontier/cluster_10/score:3.290201429177073 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:1.6628969248999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.7680699999999998 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:2.27416405343 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:1.46207 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:112.0 - 
frontier/cluster_17/score:1.6044136069999997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:2.2817089999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:2.939066053429999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:1.7680699999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:2.7029962999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.2786772474299997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:1.5784900099999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:1.3763542999999998 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:1.5376489999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:112.0 - frontier/cluster_33/score:2.792949524899999 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:112.0 - frontier/cluster_34/score:1.6842136069999998 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:1.46207 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:1.5880699999999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:1.7015080099999995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:1.9989394099999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.1194789248999992 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:1.7680699999999998 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.4316489999999997 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:2.9331340732009994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:1.2401 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.2628969248999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:2.7828794932009995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:27.0 - cluster/prob_snapshot/cluster_0:0.049266351191244215 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.021446373130053018 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.054086225965568044 - 
cluster/prob_snapshot/cluster_11:0.02733565733696884 - cluster/prob_snapshot/cluster_12:0.029064552916099086 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.03738401843298434 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.02403434868644397 - cluster/prob_snapshot/cluster_17:0.026374274875972614 - cluster/prob_snapshot/cluster_18:0.03750804660994165 - cluster/prob_snapshot/cluster_19:0.048314060435291996 - cluster/prob_snapshot/cluster_20:0.029064552916099086 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.04443340987255597 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.037458208915167525 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02594812785872662 - cluster/prob_snapshot/cluster_31:0.02262530464498041 - cluster/prob_snapshot/cluster_32:0.02527675981544104 - cluster/prob_snapshot/cluster_33:0.04591211278877524 - cluster/prob_snapshot/cluster_34:0.02768607323390228 - cluster/prob_snapshot/cluster_35:0.02403434868644397 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.02610560925159608 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.027970368590503454 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.03285971723858268 - cluster/prob_snapshot/cluster_44:0.03484121520489168 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.029064552916099086 - cluster/prob_snapshot/cluster_47:0.039972846682472654 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.0482165471279778 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.020385477990834344 - 
cluster/prob_snapshot/cluster_52:0.020385477990834344 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0371988028853122 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.045746575808235795 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:   4%|▎         | 28/800 [48:19<26:49:31, 125.09s/it]
[36m(TaskRunner pid=2823680)[0m step:28 - global_seqlen/min:314898 - global_seqlen/max:451812 - global_seqlen/minmax_diff:136914 - global_seqlen/balanced_min:378309 - global_seqlen/balanced_max:378397 - global_seqlen/mean:378362.0 - frontier/skipped_zero_acc_count:40.0 - actor/entropy:np.float64(0.34965403174812143) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009907991625368595 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.08105287980288267) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00021113708321122843) - actor/ppo_kl:np.float64(2.2204138306911794e-05) - actor/pg_clipfrac_lower:np.float64(6.705831765430048e-07) - actor/grad_norm:np.float64(0.2460102899508043) - perf/mfu/actor:np.float64(0.2325709586965181) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(115.32146453857422) - actor/lr:np.float64(1e-06) - training/global_step:28 - training/epoch:0 - critic/score/mean:0.5241477489471436 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5168635845184326 - critic/rewards/max:1.0033060312271118 - critic/rewards/min:-0.06638364493846893 - critic/advantages/mean:-0.20869600772857666 - critic/advantages/max:2.4748592376708984 - critic/advantages/min:-2.4748618602752686 - critic/returns/mean:-0.20869600772857666 - critic/returns/max:2.4748592376708984 - critic/returns/min:-2.4748618602752686 - response_length/mean:1086.3480224609375 - response_length/max:8192.0 - response_length/min:116.0 - response_length/clip_ratio:0.008522727526724339 - response_length_non_aborted/mean:1086.3480224609375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:116.0 - response_length_non_aborted/clip_ratio:0.008522727526724339 - response/aborted_ratio:0.0 - prompt_length/mean:246.03408813476562 - prompt_length/max:886.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.848829358816147e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0765666840597987) - timing_s/agent_loop/generate_sequences/max:np.float64(29.248317963443696) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.418314122533957) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.248317963443696) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:230 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.875646399334073 - timing_s/reward:0.00030075013637542725 - timing_s/old_log_prob:8.882059982046485 - timing_s/ref:20.04110264033079 - timing_s/adv:0.08358672447502613 - timing_s/update_actor:19.30305908806622 - timing_s/update_weights:26.633199743926525 - timing_s/step:106.22917861212045 - timing_s/stop_profile:7.472001016139984e-05 - timing_per_token_ms/adv:8.911193156803927e-05 - timing_per_token_ms/update_actor:0.020579020069431162 - timing_per_token_ms/gen:0.040371457224586224 - timing_per_token_ms/ref:0.021365849400723873 - perf/total_num_tokens:1513448 - perf/time_per_step:106.22917861212045 - perf/throughput:3561.7520999718054 - frontier/active_count:28.0 - frontier/completed_count:36.0 - frontier/blacklisted_count:1543.0 - frontier/mean_score:2.050683460267124 - frontier/mean_frontier_pct:0.5707886468699277 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:28.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:2.9969962999999993 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.3046369248999998 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:144.0 - frontier/cluster_10/score:3.290201429177073 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:1.6628969248999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:2.1376489999999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:2.27416405343 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:1.46207 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:112.0 - 
frontier/cluster_17/score:1.6044136069999997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:2.2817089999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:112.0 - frontier/cluster_19/score:2.957346237400999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:1.7680699999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:2.7029962999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:112.0 - frontier/cluster_30/score:1.4049430069999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:1.3763542999999998 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:96.0 - frontier/cluster_32/score:1.3763542999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:128.0 - frontier/cluster_33/score:2.2550646674299992 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:112.0 - 
frontier/cluster_34/score:2.0789495248999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:1.3234489999999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.411649 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:1.7015080099999995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:1.9989394099999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.383635247429999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:1.5376489999999998 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:2.6021542999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:128.0 - frontier/cluster_49/score:2.9531938512406994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - 
frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.4840278474299997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:128.0 - frontier/cluster_59/score:2.8480156452406993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:28.0 - cluster/prob_snapshot/cluster_0:0.05219507750304602 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.022721291116872964 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0573014783490157 - cluster/prob_snapshot/cluster_11:0.02896067435075993 - 
cluster/prob_snapshot/cluster_12:0.03722885985188198 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.03960637823390712 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.025463113506305802 - cluster/prob_snapshot/cluster_17:0.027942140790866717 - cluster/prob_snapshot/cluster_18:0.039737779487548135 - cluster/prob_snapshot/cluster_19:0.05150454008384553 - cluster/prob_snapshot/cluster_20:0.03079234721804982 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.047074833348625306 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.024468201424782385 - cluster/prob_snapshot/cluster_31:0.023970306323084438 - cluster/prob_snapshot/cluster_32:0.023970306323084438 - cluster/prob_snapshot/cluster_33:0.039273747215133215 - cluster/prob_snapshot/cluster_34:0.03620656174219375 - cluster/prob_snapshot/cluster_35:0.02304891838749643 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.024584991633822648 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02963311714932835 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.034813121867665664 - cluster/prob_snapshot/cluster_44:0.04151290626504982 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.026779381985717242 - cluster/prob_snapshot/cluster_47:0.045318589603659 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.051432222971722494 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.021597329169718158 - cluster/prob_snapshot/cluster_53:0.0 
- cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.04326132335109438 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.04960046074572263 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:   4%|▎         | 29/800 [50:05<25:33:43, 119.36s/it]
[36m(TaskRunner pid=2823680)[0m step:29 - global_seqlen/min:316784 - global_seqlen/max:412365 - global_seqlen/minmax_diff:95581 - global_seqlen/balanced_min:366176 - global_seqlen/balanced_max:366218 - global_seqlen/mean:366201.25 - frontier/skipped_zero_acc_count:27.0 - actor/entropy:np.float64(0.3159872452710189) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01061810739338398 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.00011411038576625288) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002257620280408232) - actor/ppo_kl:np.float64(3.6102306698832728e-06) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.24685022578789637) - perf/mfu/actor:np.float64(0.20598355147216596) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(114.84773254394531) - actor/lr:np.float64(1e-06) - training/global_step:29 - training/epoch:0 - critic/score/mean:0.5804455280303955 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5731439590454102 - critic/rewards/max:1.0019789934158325 - critic/rewards/min:-0.06778757274150848 - critic/advantages/mean:-0.19434116780757904 - critic/advantages/max:2.474851369857788 - critic/advantages/min:-2.4748570919036865 - critic/returns/mean:-0.19434116780757904 - critic/returns/max:2.474851369857788 - critic/returns/min:-2.4748570919036865 - response_length/mean:1065.6348876953125 - response_length/max:8192.0 - response_length/min:150.0 - response_length/clip_ratio:0.006188118830323219 - response_length_non_aborted/mean:1065.6348876953125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:150.0 - response_length_non_aborted/clip_ratio:0.006188118830323219 - response/aborted_ratio:0.0 - prompt_length/mean:231.37623596191406 - prompt_length/max:520.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010266900062561035 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.185388789512217) - timing_s/agent_loop/generate_sequences/max:np.float64(27.61377278994769) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.256006952397911) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.61377278994769) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:202 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.943595016375184 - timing_s/reward:0.00043655838817358017 - timing_s/old_log_prob:9.741276838816702 - timing_s/ref:18.96910967770964 - timing_s/adv:0.10943986009806395 - timing_s/update_actor:20.900476661510766 - timing_s/update_weights:25.602419605478644 - timing_s/step:105.69322454743087 - timing_s/stop_profile:9.624753147363663e-05 - timing_per_token_ms/adv:0.00010442884210944236 - timing_per_token_ms/update_actor:0.019943488371981247 - timing_per_token_ms/gen:0.03477636166833929 - timing_per_token_ms/ref:0.018100554566820744 - perf/total_num_tokens:1464805 - perf/time_per_step:105.69322454743087 - perf/throughput:3464.756152232479 - frontier/active_count:26.0 - frontier/completed_count:38.0 - frontier/blacklisted_count:1570.0 - frontier/mean_score:2.088496048215744 - frontier/mean_frontier_pct:0.5798433430781462 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:28.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:2.9978974099999993 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.3046369248999998 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:160.0 - frontier/cluster_10/score:3.203141000423951 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:1.6628969248999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:2.1376489999999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:128.0 - frontier/cluster_14/score:2.4919148374009996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:1.9234489999999997 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:112.0 - 
frontier/cluster_17/score:1.6044136069999997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:2.4971962999999997 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.1376489999999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:3.39209741 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:128.0 - frontier/cluster_30/score:1.2834601048999998 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:1.3763542999999998 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:96.0 - frontier/cluster_32/score:1.3763542999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:1.8785452672009995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - 
frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:1.3234489999999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.411649 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:2.0910556069999995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.2992575869999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:2.568544673200999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:1.5376489999999998 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:2.1215080099999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:128.0 - frontier/cluster_49/score:2.9531938512406994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - 
frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:128.0 - frontier/cluster_56/score:2.638819493201 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:128.0 - frontier/cluster_59/score:2.8480156452406993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:29.0 - cluster/prob_snapshot/cluster_0:0.05520898477972629 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.024026065698449974 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.05898873061827796 - cluster/prob_snapshot/cluster_11:0.030623746733567433 - 
cluster/prob_snapshot/cluster_12:0.03936673440249482 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.04589085933078875 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.03542204820330385 - cluster/prob_snapshot/cluster_17:0.029546723685000536 - cluster/prob_snapshot/cluster_18:0.04598812222820154 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.03936673440249482 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.062468533331178605 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02363607545756878 - cluster/prob_snapshot/cluster_31:0.025346805846905494 - cluster/prob_snapshot/cluster_32:0.025346805846905494 - cluster/prob_snapshot/cluster_33:0.03459510546257381 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.024372507029099433 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.025996789581707484 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.03850867504516251 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.04234290229609732 - cluster/prob_snapshot/cluster_44:0.04730206687386312 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.028317193228290412 - cluster/prob_snapshot/cluster_47:0.03906948351316578 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.054385728424485705 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.02283756001688483 - cluster/prob_snapshot/cluster_53:0.0 - 
cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.04859624106902947 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.05244877689477578 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:   4%|▍         | 30/800 [51:50<24:35:17, 114.96s/it]
[36m(TaskRunner pid=2823680)[0m step:30 - global_seqlen/min:343933 - global_seqlen/max:412280 - global_seqlen/minmax_diff:68347 - global_seqlen/balanced_min:362632 - global_seqlen/balanced_max:362701 - global_seqlen/mean:362666.5 - frontier/skipped_zero_acc_count:41.0 - actor/entropy:np.float64(0.3338037666610696) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012499969452619553 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04236912081978517) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002084665564490768) - actor/ppo_kl:np.float64(2.1992099277667876e-05) - actor/pg_clipfrac_lower:np.float64(2.4684240171071986e-07) - actor/grad_norm:np.float64(0.2521408579566262) - perf/mfu/actor:np.float64(0.23045446043783754) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(114.52683639526367) - actor/lr:np.float64(1e-06) - training/global_step:30 - training/epoch:0 - critic/score/mean:0.5287356376647949 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5210451483726501 - critic/rewards/max:1.0044829845428467 - critic/rewards/min:-0.045539744198322296 - critic/advantages/mean:-0.2973010241985321 - critic/advantages/max:2.4748592376708984 - critic/advantages/min:-2.4748613834381104 - critic/returns/mean:-0.2973010241985321 - critic/returns/max:2.4748592376708984 - critic/returns/min:-2.4748613834381104 - response_length/mean:1051.19970703125 - response_length/max:8192.0 - response_length/min:136.0 - response_length/clip_ratio:0.01149425283074379 - response_length_non_aborted/mean:1051.19970703125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:136.0 - response_length_non_aborted/clip_ratio:0.01149425283074379 - response/aborted_ratio:0.0 - prompt_length/mean:227.48275756835938 - prompt_length/max:324.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.363835513591766e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0812251828610897) - timing_s/agent_loop/generate_sequences/max:np.float64(27.812362200580537) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.998637121215324) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.812362200580537) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:187 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.98943460173905 - timing_s/reward:0.00012620631605386734 - timing_s/old_log_prob:9.390614837408066 - timing_s/ref:19.298035548999906 - timing_s/adv:0.070543491281569 - timing_s/update_actor:18.53147622384131 - timing_s/update_weights:26.80365296639502 - timing_s/step:104.4615349760279 - timing_s/stop_profile:5.601532757282257e-05 - timing_per_token_ms/adv:7.92656450679062e-05 - timing_per_token_ms/update_actor:0.02082274906242317 - timing_per_token_ms/gen:0.040989611762339215 - timing_per_token_ms/ref:0.021684087483412124 - perf/total_num_tokens:1450666 - perf/time_per_step:104.4615349760279 - perf/throughput:3471.7707343973607 - frontier/active_count:26.0 - frontier/completed_count:38.0 - frontier/blacklisted_count:1611.0 - frontier/mean_score:2.0672585401379666 - frontier/mean_frontier_pct:0.61875694298449 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:28.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:80.0 - frontier/cluster_0/score:2.9985281869999993 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.3046369248999998 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:176.0 - frontier/cluster_10/score:3.7421987002967656 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:1.6628969248999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:1.7963542999999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:128.0 - frontier/cluster_14/score:2.6443403861806996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - frontier/cluster_16/score:1.6464142999999998 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:128.0 - 
frontier/cluster_17/score:1.4230895248999997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:2.6480374099999997 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.3963542999999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:3.2744681869999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:144.0 - frontier/cluster_30/score:1.1984220734299997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:1.3763542999999998 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:96.0 - frontier/cluster_32/score:1.3763542999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:1.8785452672009995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - 
frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:1.3234489999999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.8881542999999998 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.3637389248999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.5094803108999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:144.0 - frontier/cluster_44/score:2.097981271240699 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:1.5376489999999998 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:2.1215080099999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:144.0 - frontier/cluster_49/score:2.3672356958684895 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - 
frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:128.0 - frontier/cluster_56/score:2.638819493201 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:144.0 - frontier/cluster_59/score:2.2936109516684895 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:30.0 - cluster/prob_snapshot/cluster_0:0.055787897330254 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.024272891992520567 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.06962395677541983 - cluster/prob_snapshot/cluster_11:0.03093835279565318 - 
cluster/prob_snapshot/cluster_12:0.03342133973982227 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.049198200173695125 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.030631692017995368 - cluster/prob_snapshot/cluster_17:0.026476713692763814 - cluster/prob_snapshot/cluster_18:0.049266983647585014 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.04458439584957375 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.06092178683512872 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.022296754748106346 - cluster/prob_snapshot/cluster_31:0.02560720046299622 - cluster/prob_snapshot/cluster_32:0.02560720046299622 - cluster/prob_snapshot/cluster_33:0.034950510370788106 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.02462289240899083 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.03512928732461425 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0439775837457706 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.04668911586148885 - cluster/prob_snapshot/cluster_44:0.039033137746779474 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.028608103440172113 - cluster/prob_snapshot/cluster_47:0.0394708549215287 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.04404264149664427 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.023072176469504706 - cluster/prob_snapshot/cluster_53:0.0 - 
cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.04909548344351459 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.042672846245693105 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826760) WARNING:2026-04-12 12:24:10,562:WARNING: Error in configuration: macro '\frac' failed its substitution!
(RewardLoopWorker pid=2826755) WARNING:2026-04-12 12:24:17,255:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:   4%|▍         | 31/800 [53:35<23:55:54, 112.03s/it]
[36m(TaskRunner pid=2823680)[0m step:31 - global_seqlen/min:310172 - global_seqlen/max:399592 - global_seqlen/minmax_diff:89420 - global_seqlen/balanced_min:370066 - global_seqlen/balanced_max:370157 - global_seqlen/mean:370099.25 - frontier/skipped_zero_acc_count:44.0 - actor/entropy:np.float64(0.335066525620364) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009693790227174759 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.12066784763374017) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002699119145074205) - actor/ppo_kl:np.float64(4.522204122146095e-05) - actor/pg_clipfrac_lower:np.float64(2.454488865277242e-07) - actor/grad_norm:np.float64(0.2394497895782644) - perf/mfu/actor:np.float64(0.24207505669444507) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(114.64792251586914) - actor/lr:np.float64(1e-06) - training/global_step:31 - training/epoch:0 - critic/score/mean:0.5684523582458496 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5614490509033203 - critic/rewards/max:1.0070412158966064 - critic/rewards/min:-0.05347701534628868 - critic/advantages/mean:-0.18752846121788025 - critic/advantages/max:2.474841356277466 - critic/advantages/min:-2.4748616218566895 - critic/returns/mean:-0.18752846121788025 - critic/returns/max:2.474841356277466 - critic/returns/min:-2.4748616218566895 - response_length/mean:1047.0714111328125 - response_length/max:8192.0 - response_length/min:139.0 - response_length/clip_ratio:0.0059523810632526875 - response_length_non_aborted/mean:1047.0714111328125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:139.0 - response_length_non_aborted/clip_ratio:0.0059523810632526875 - response/aborted_ratio:0.0 - prompt_length/mean:228.5833282470703 - prompt_length/max:404.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00012054014950990677 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1809284118935466) - timing_s/agent_loop/generate_sequences/max:np.float64(27.294308627024293) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.378208805369468) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.294308627024293) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:211 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.065463793464005 - timing_s/reward:0.0003216303884983063 - timing_s/old_log_prob:8.377256462350488 - timing_s/ref:18.795250460505486 - timing_s/adv:0.07836981862783432 - timing_s/update_actor:17.96815679129213 - timing_s/update_weights:30.334896181710064 - timing_s/step:105.01823266036808 - timing_s/stop_profile:6.77499920129776e-05 - timing_per_token_ms/adv:9.142109400848575e-05 - timing_per_token_ms/update_actor:0.020960474069446282 - timing_per_token_ms/gen:0.041307762855390326 - timing_per_token_ms/ref:0.021925307335758346 - perf/total_num_tokens:1480397 - perf/time_per_step:105.01823266036808 - perf/throughput:3524.1428142950317 - frontier/active_count:25.0 - frontier/completed_count:39.0 - frontier/blacklisted_count:1655.0 - frontier/mean_score:2.146172696377683 - frontier/mean_frontier_pct:0.6511208817214014 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:29.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:80.0 - frontier/cluster_0/score:2.9989697308999994 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:144.0 - frontier/cluster_8/score:1.2132458474299999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:176.0 - frontier/cluster_10/score:3.7421987002967656 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:1.6628969248999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:1.5574480099999999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:2.7510382703264895 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - frontier/cluster_16/score:1.6464142999999998 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:1.4230895248999997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:2.7536261869999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:96.0 - frontier/cluster_20/score:2.5774480099999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:3.1921277308999993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:1.8634480099999997 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.2634480099999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:1.6149816870406997 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:1.2264142999999998 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.2217080099999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.3637389248999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.656636217629999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:144.0 - frontier/cluster_44/score:2.097981271240699 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:1.9763542999999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:2.1215080099999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:144.0 - frontier/cluster_49/score:2.5570649871079425 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:128.0 - frontier/cluster_56/score:2.638819493201 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:144.0 - frontier/cluster_59/score:2.2936109516684895 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:31.0 - cluster/prob_snapshot/cluster_0:0.05589428541257038 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.022612268797897205 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.06974645994915246 - 
cluster/prob_snapshot/cluster_11:0.030992788748205442 - cluster/prob_snapshot/cluster_12:0.029027449890284513 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.05127338121428346 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.03068558840169429 - cluster/prob_snapshot/cluster_17:0.026523299402734825 - cluster/prob_snapshot/cluster_18:0.05132161436304876 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.048038035603569534 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.05949433121179264 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.03473062560427002 - cluster/prob_snapshot/cluster_32:0.023547928125867063 - cluster/prob_snapshot/cluster_33:0.03009975273222832 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.02285770016681222 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.04140781426862442 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.044054962191803576 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.049513931886541615 - cluster/prob_snapshot/cluster_44:0.03910181645273334 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.03683495374506807 - cluster/prob_snapshot/cluster_47:0.03954030378973113 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.04765814030574081 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.02311277190494585 - 
cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.04918186682096567 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.04274792900943439 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:   4%|▍         | 32/800 [55:23<23:40:48, 111.00s/it]
[36m(TaskRunner pid=2823680)[0m step:32 - global_seqlen/min:310699 - global_seqlen/max:416718 - global_seqlen/minmax_diff:106019 - global_seqlen/balanced_min:352261 - global_seqlen/balanced_max:352404 - global_seqlen/mean:352313.25 - frontier/skipped_zero_acc_count:43.0 - actor/entropy:np.float64(0.34389190098573996) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012500821612775326 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.02830016225198051) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000187356434337765) - actor/ppo_kl:np.float64(-1.731639605924461e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.25367656214670703) - perf/mfu/actor:np.float64(0.2182661924808937) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(113.04988479614258) - actor/lr:np.float64(1e-06) - training/global_step:32 - training/epoch:0 - critic/score/mean:0.5426470637321472 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5354522466659546 - critic/rewards/max:1.0134360790252686 - critic/rewards/min:-0.05616135150194168 - critic/advantages/mean:-0.17432048916816711 - critic/advantages/max:2.474839687347412 - critic/advantages/min:-2.4748477935791016 - critic/returns/mean:-0.17432048916816711 - critic/returns/max:2.474839687347412 - critic/returns/min:-2.4748477935791016 - response_length/mean:979.2911987304688 - response_length/max:8192.0 - response_length/min:119.0 - response_length/clip_ratio:0.004411764908581972 - response_length_non_aborted/mean:979.2911987304688 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:119.0 - response_length_non_aborted/clip_ratio:0.004411764908581972 - response/aborted_ratio:0.0 - prompt_length/mean:225.3882293701172 - prompt_length/max:317.0 - prompt_length/min:169.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.816923946142197e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.325855916365981) - timing_s/agent_loop/generate_sequences/max:np.float64(27.683300766162574) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.602898997656666) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.683300766162574) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:226 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.869404831901193 - timing_s/reward:0.00015339255332946777 - timing_s/old_log_prob:9.143951554782689 - timing_s/ref:23.906664395704865 - timing_s/adv:0.06619234662503004 - timing_s/update_actor:18.868067184463143 - timing_s/update_weights:26.118420614860952 - timing_s/step:108.39237155392766 - timing_s/stop_profile:7.087644189596176e-05 - timing_per_token_ms/adv:8.080297983235721e-05 - timing_per_token_ms/update_actor:0.023032814666903257 - timing_per_token_ms/gen:0.04485447882757516 - timing_per_token_ms/ref:0.029183581176960512 - perf/total_num_tokens:1409253 - perf/time_per_step:108.39237155392766 - perf/throughput:3250.3509698070975 - frontier/active_count:22.0 - frontier/completed_count:42.0 - frontier/blacklisted_count:1698.0 - frontier/mean_score:2.223968740972371 - frontier/mean_frontier_pct:0.6692966336374027 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:31.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:2.9992788116299995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:176.0 - frontier/cluster_10/score:3.5195390902077355 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:1.46402784743 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:128.0 - frontier/cluster_12/score:1.390213607 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:2.8257267892285425 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - frontier/cluster_16/score:1.4524900099999998 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:1.4230895248999997 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:2.8275383308999995 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:96.0 - frontier/cluster_20/score:2.7042136069999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:3.1921277308999993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:1.8634480099999997 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.2634480099999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:1.6149816870406997 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:1.2264142999999998 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:3.0551956069999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.5546172474299995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:2.1596453523409993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:144.0 - frontier/cluster_44/score:2.097981271240699 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:1.9763542999999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:2.1215080099999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - frontier/cluster_49/score:2.6899454909755596 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:144.0 - frontier/cluster_59/score:2.5055276661679424 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:32.0 - cluster/prob_snapshot/cluster_0:0.0613007065083495 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.07193403693477746 - cluster/prob_snapshot/cluster_11:0.029922507053148365 - cluster/prob_snapshot/cluster_12:0.028413856016375603 - cluster/prob_snapshot/cluster_13:0.0 - 
cluster/prob_snapshot/cluster_14:0.057753566593277586 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.029686691168578062 - cluster/prob_snapshot/cluster_17:0.029085789878131266 - cluster/prob_snapshot/cluster_18:0.057790591755426245 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.055270021585123 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.06524224570596679 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.03808604892337357 - cluster/prob_snapshot/cluster_32:0.025822959622575668 - cluster/prob_snapshot/cluster_33:0.03300777441222225 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.025066046801125914 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.062443560933410754 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.05221249905765436 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.04413987302301805 - cluster/prob_snapshot/cluster_44:0.04287955280104367 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.04039368211819321 - cluster/prob_snapshot/cluster_47:0.04336040363164674 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.054978402950186585 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - 
cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0512091825263954 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:   4%|▍         | 33/800 [57:02<22:50:50, 107.24s/it]
[36m(TaskRunner pid=2823680)[0m step:33 - global_seqlen/min:310870 - global_seqlen/max:388100 - global_seqlen/minmax_diff:77230 - global_seqlen/balanced_min:365953 - global_seqlen/balanced_max:366136 - global_seqlen/mean:366060.0 - frontier/skipped_zero_acc_count:51.0 - actor/entropy:np.float64(0.3776455035385413) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011714519932866096 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.015294495417037979) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000228973951939574) - actor/ppo_kl:np.float64(6.06310109892897e-06) - actor/pg_clipfrac_lower:np.float64(3.842849018371616e-07) - actor/grad_norm:np.float64(0.25195412188768385) - perf/mfu/actor:np.float64(0.24843604933354851) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(113.31985855102539) - actor/lr:np.float64(1e-06) - training/global_step:33 - training/epoch:0 - critic/score/mean:0.5146104097366333 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5071258544921875 - critic/rewards/max:1.001463770866394 - critic/rewards/min:-0.05229533463716507 - critic/advantages/mean:-0.13671275973320007 - critic/advantages/max:2.4748497009277344 - critic/advantages/min:-2.474862575531006 - critic/returns/mean:-0.13671275973320007 - critic/returns/max:2.4748497009277344 - critic/returns/min:-2.474862575531006 - response_length/mean:1085.387939453125 - response_length/max:8192.0 - response_length/min:129.0 - response_length/clip_ratio:0.009740259498357773 - response_length_non_aborted/mean:1085.387939453125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:129.0 - response_length_non_aborted/clip_ratio:0.009740259498357773 - response/aborted_ratio:0.0 - prompt_length/mean:232.66233825683594 - prompt_length/max:684.0 - prompt_length/min:169.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.58008861541748e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0162275414913893) - timing_s/agent_loop/generate_sequences/max:np.float64(28.362840864807367) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.318155474774358) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.362840864807367) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:219 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.983701632358134 - timing_s/reward:0.0001389235258102417 - timing_s/old_log_prob:9.01481347065419 - timing_s/ref:16.85401896853 - timing_s/adv:0.06460957415401936 - timing_s/update_actor:17.238507722504437 - timing_s/update_weights:23.75066805165261 - timing_s/step:98.27027699351311 - timing_s/stop_profile:5.88512048125267e-05 - timing_per_token_ms/adv:7.957637911419656e-05 - timing_per_token_ms/update_actor:0.021231807264646397 - timing_per_token_ms/gen:0.04634123238646503 - timing_per_token_ms/ref:0.020758251708027524 - perf/total_num_tokens:1464240 - perf/time_per_step:98.27027699351311 - perf/throughput:3725.032748449095 - frontier/active_count:22.0 - frontier/completed_count:42.0 - frontier/blacklisted_count:1749.0 - frontier/mean_score:2.2399660034772624 - frontier/mean_frontier_pct:0.7092076391966092 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:31.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:2.9994951681409994 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:192.0 - frontier/cluster_10/score:3.9636773631454147 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:1.9248194932009999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:1.2731495249 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:2.8780087524599796 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:1.316743007 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:144.0 - frontier/cluster_17/score:1.2961626674299997 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:2.8792768316299995 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.1929495248999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:3.1921277308999993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:2.2044136069999993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:128.0 - frontier/cluster_32/score:1.1844136069999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:1.6149816870406997 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:1.7584900099999996 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:3.0386369248999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:128.0 - frontier/cluster_41/score:2.0882320732009996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:2.411751746638699 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:160.0 - frontier/cluster_44/score:1.7685868898684893 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:1.9763542999999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:2.1215080099999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - frontier/cluster_49/score:2.6899454909755596 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:144.0 - frontier/cluster_59/score:2.5055276661679424 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:33.0 - cluster/prob_snapshot/cluster_0:0.06086730300785946 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.08043298540717114 - cluster/prob_snapshot/cluster_11:0.03905942992423997 - cluster/prob_snapshot/cluster_12:0.02583540681428357 - cluster/prob_snapshot/cluster_13:0.0 - 
cluster/prob_snapshot/cluster_14:0.05840203800155564 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.026720028237358875 - cluster/prob_snapshot/cluster_17:0.026302401372039333 - cluster/prob_snapshot/cluster_18:0.05842777051811355 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.044500462821467426 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.06477630232586783 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.044733097888294396 - cluster/prob_snapshot/cluster_32:0.02403473180074544 - cluster/prob_snapshot/cluster_33:0.0327720413559371 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.035684186263017295 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.06166158772423944 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.04237548228125063 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.048940510357072003 - cluster/prob_snapshot/cluster_44:0.03588907735699769 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.04010520080402115 - cluster/prob_snapshot/cluster_47:0.043050734753575966 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.0545857613017292 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - 
cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.05084345968316299 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:   4%|▍         | 34/800 [58:45<22:33:50, 106.04s/it]
[36m(TaskRunner pid=2823680)[0m step:34 - global_seqlen/min:312456 - global_seqlen/max:357947 - global_seqlen/minmax_diff:45491 - global_seqlen/balanced_min:332896 - global_seqlen/balanced_max:333114 - global_seqlen/mean:333014.0 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.3295295106490021) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013065842911601067 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.09732126504241023) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00018885935348763545) - actor/ppo_kl:np.float64(6.973175488442591e-05) - actor/pg_clipfrac_lower:np.float64(3.8351442091642757e-07) - actor/grad_norm:np.float64(0.25160782411694527) - perf/mfu/actor:np.float64(0.19967707912736646) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(113.86653137207031) - actor/lr:np.float64(1e-06) - training/global_step:34 - training/epoch:0 - critic/score/mean:0.5206043720245361 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5132293105125427 - critic/rewards/max:1.0008680820465088 - critic/rewards/min:-0.06716147810220718 - critic/advantages/mean:-0.13797573745250702 - critic/advantages/max:2.4748432636260986 - critic/advantages/min:-2.474857807159424 - critic/returns/mean:-0.13797573745250702 - critic/returns/max:2.4748432636260986 - critic/returns/min:-2.474857807159424 - response_length/mean:975.6085205078125 - response_length/max:8192.0 - response_length/min:122.0 - response_length/clip_ratio:0.006868131924420595 - response_length_non_aborted/mean:975.6085205078125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:122.0 - response_length_non_aborted/clip_ratio:0.006868131924420595 - response/aborted_ratio:0.0 - prompt_length/mean:239.24176025390625 - prompt_length/max:804.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.344929665327072e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0164009742438793) - timing_s/agent_loop/generate_sequences/max:np.float64(27.105829485692084) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.477164714124228) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.105829485692084) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:189 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.76286698691547 - timing_s/reward:0.00011883024126291275 - timing_s/old_log_prob:9.077701672911644 - timing_s/ref:18.733964609913528 - timing_s/adv:0.07164242211729288 - timing_s/update_actor:19.507991435937583 - timing_s/update_weights:25.48594663385302 - timing_s/step:103.03943824023008 - timing_s/stop_profile:5.71003183722496e-05 - timing_per_token_ms/adv:8.10058017339143e-05 - timing_per_token_ms/update_actor:0.02205760832456582 - timing_per_token_ms/gen:0.04190518876907688 - timing_per_token_ms/ref:0.021182419271032957 - perf/total_num_tokens:1332056 - perf/time_per_step:103.03943824023008 - perf/throughput:3231.908147864689 - frontier/active_count:22.0 - frontier/completed_count:42.0 - frontier/blacklisted_count:1786.0 - frontier/mean_score:2.233632055964618 - frontier/mean_frontier_pct:0.7547076579475813 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:31.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:2.9996466176986996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:208.0 - frontier/cluster_10/score:4.27457415420179 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:144.0 - frontier/cluster_11/score:1.6473736452407 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:1.7912046674299997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:2.3146061267219853 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:1.2217201049 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:144.0 - 
frontier/cluster_17/score:1.2961626674299997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:2.9154937821409996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.4350646674299994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:3.1921277308999993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:2.4430895248999995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:128.0 - frontier/cluster_32/score:1.1844136069999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:1.6149816870406997 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - 
frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:1.7584900099999996 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:128.0 - frontier/cluster_38/score:3.0270458474299993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:128.0 - frontier/cluster_41/score:2.3617624512406996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:128.0 - frontier/cluster_43/score:1.9882262226470893 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:160.0 - frontier/cluster_44/score:1.7685868898684893 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:1.6834480099999998 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:1.7850556069999997 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - frontier/cluster_49/score:2.782961843682892 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - 
frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:2.6538693663175597 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:34.0 - cluster/prob_snapshot/cluster_0:0.061042987437282245 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.08698783878577551 - cluster/prob_snapshot/cluster_11:0.0335241518576235 - cluster/prob_snapshot/cluster_12:0.03645112173093769 - 
cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.047102372620193296 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.02486207694441719 - cluster/prob_snapshot/cluster_17:0.026376987528385958 - cluster/prob_snapshot/cluster_18:0.059330472218505785 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.04955371110245556 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.06495998956204427 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.04971701743021993 - cluster/prob_snapshot/cluster_32:0.024102887488831975 - cluster/prob_snapshot/cluster_33:0.03286497357782048 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.03578537650257256 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.06160056339520028 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.04806200663448019 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.040460522121313486 - cluster/prob_snapshot/cluster_44:0.03599084860963057 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.03425826733036518 - cluster/prob_snapshot/cluster_47:0.03632598798472742 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.0566334393725022 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - 
cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.05400639976471493 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:   4%|▍         | 35/800 [1:00:33<22:38:37, 106.56s/it]
[36m(TaskRunner pid=2823680)[0m step:35 - global_seqlen/min:294489 - global_seqlen/max:451867 - global_seqlen/minmax_diff:157378 - global_seqlen/balanced_min:379203 - global_seqlen/balanced_max:379275 - global_seqlen/mean:379238.5 - frontier/skipped_zero_acc_count:40.0 - actor/entropy:np.float64(0.3413902656598525) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011290665715932846 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.005819380814500619) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00017818791469504834) - actor/ppo_kl:np.float64(2.158321267180115e-05) - actor/pg_clipfrac_lower:np.float64(1.5834470807683167e-06) - actor/grad_norm:np.float64(0.24911610240286047) - perf/mfu/actor:np.float64(0.2318886092055191) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(114.19895553588867) - actor/lr:np.float64(1e-06) - training/global_step:35 - training/epoch:0 - critic/score/mean:0.49147728085517883 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.48420464992523193 - critic/rewards/max:1.0041002035140991 - critic/rewards/min:-0.039311617612838745 - critic/advantages/mean:-0.17495033144950867 - critic/advantages/max:2.4748523235321045 - critic/advantages/min:-2.4748570919036865 - critic/returns/mean:-0.17495033144950867 - critic/returns/max:2.4748523235321045 - critic/returns/min:-2.4748570919036865 - response_length/mean:1061.680419921875 - response_length/max:8192.0 - response_length/min:119.0 - response_length/clip_ratio:0.007102272938936949 - response_length_non_aborted/mean:1061.680419921875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:119.0 - response_length_non_aborted/clip_ratio:0.007102272938936949 - response/aborted_ratio:0.0 - prompt_length/mean:228.61363220214844 - prompt_length/max:404.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.0001079430803656578 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0118534481152892) - timing_s/agent_loop/generate_sequences/max:np.float64(28.51733463536948) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.459088295137917) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.51733463536948) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:184 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.827715539373457 - timing_s/reward:0.00017234310507774353 - timing_s/old_log_prob:9.325593560934067 - timing_s/ref:20.09380200598389 - timing_s/adv:0.06085981521755457 - timing_s/update_actor:19.286116126924753 - timing_s/update_weights:27.575963803566992 - timing_s/step:107.55572652537376 - timing_s/stop_profile:5.033425986766815e-05 - timing_per_token_ms/adv:6.699914816099062e-05 - timing_per_token_ms/update_actor:0.021231634490161744 - timing_per_token_ms/gen:0.04124533970639579 - timing_per_token_ms/ref:0.022120796997231174 - perf/total_num_tokens:1516954 - perf/time_per_step:107.55572652537376 - perf/throughput:3525.972184386973 - frontier/active_count:19.0 - frontier/completed_count:45.0 - frontier/blacklisted_count:1826.0 - frontier/mean_score:2.005676808681352 - frontier/mean_frontier_pct:0.7720159631600736 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:31.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:2.9996466176986996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:160.0 - frontier/cluster_11/score:1.45316155166849 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:160.0 - frontier/cluster_12/score:1.5538432672009999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:192.0 - frontier/cluster_14/score:1.9202242887053897 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:1.2217201049 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:160.0 - 
frontier/cluster_17/score:1.2073138672009998 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:2.9408456474986995 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:128.0 - frontier/cluster_20/score:2.6045452672009994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:128.0 - frontier/cluster_32/score:1.1844136069999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:1.6149816870406997 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:2.1309430069999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:144.0 - frontier/cluster_38/score:2.4189320932009992 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:144.0 - frontier/cluster_41/score:2.5532337158684895 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:1.6917583558529625 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:176.0 - frontier/cluster_44/score:1.5380108229079426 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:1.6834480099999998 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:1.7850556069999997 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:176.0 - frontier/cluster_49/score:2.848073290578024 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - 
frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:2.7577085564222914 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:35.0 - cluster/prob_snapshot/cluster_0:0.07871464489705732 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.03813285699813439 - cluster/prob_snapshot/cluster_12:0.04077487670772542 - cluster/prob_snapshot/cluster_13:0.0 - 
cluster/prob_snapshot/cluster_14:0.050389193219069886 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.03205953116390014 - cluster/prob_snapshot/cluster_17:0.031681492671602876 - cluster/prob_snapshot/cluster_18:0.07717163064278279 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.068346669443124 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.031080559935347814 - cluster/prob_snapshot/cluster_33:0.04237922869334074 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.055918727593505094 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.06347593733974218 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.06700018732138849 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.044393948756123576 - cluster/prob_snapshot/cluster_44:0.04035941269172191 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.04417587442732496 - cluster/prob_snapshot/cluster_47:0.04684219047585813 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.07473716283307912 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - 
cluster/prob_snapshot/cluster_59:0.07236587418917127 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(RewardLoopWorker pid=2826760)[0m WARNING:2026-04-12 12:32:58,951:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:   4%|▍         | 36/800 [1:02:18<22:30:28, 106.06s/it]
[36m(TaskRunner pid=2823680)[0m step:36 - global_seqlen/min:332045 - global_seqlen/max:481735 - global_seqlen/minmax_diff:149690 - global_seqlen/balanced_min:398202 - global_seqlen/balanced_max:398351 - global_seqlen/mean:398273.0 - frontier/skipped_zero_acc_count:46.0 - actor/entropy:np.float64(0.30224950202718015) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009957753121852875 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.00104133444983745) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00020546605806086472) - actor/ppo_kl:np.float64(-1.952474447329578e-05) - actor/pg_clipfrac_lower:np.float64(2.50722081457242e-06) - actor/grad_norm:np.float64(0.220584975047545) - perf/mfu/actor:np.float64(0.27618814118264146) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(114.11122512817383) - actor/lr:np.float64(1e-06) - training/global_step:36 - training/epoch:0 - critic/score/mean:0.48170730471611023 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.47412240505218506 - critic/rewards/max:1.0012587308883667 - critic/rewards/min:-0.059730544686317444 - critic/advantages/mean:-0.12320037931203842 - critic/advantages/max:2.474855661392212 - critic/advantages/min:-2.474818468093872 - critic/returns/mean:-0.12320037931203842 - critic/returns/max:2.474855661392212 - critic/returns/min:-2.474818468093872 - response_length/mean:1182.138671875 - response_length/max:8192.0 - response_length/min:130.0 - response_length/clip_ratio:0.0030487803742289543 - response_length_non_aborted/mean:1182.138671875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:130.0 - response_length_non_aborted/clip_ratio:0.0030487803742289543 - response/aborted_ratio:0.0 - prompt_length/mean:237.64634704589844 - prompt_length/max:382.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.616875857114792e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0546743739396334) - timing_s/agent_loop/generate_sequences/max:np.float64(29.006827857345343) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.9408052208818845) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.006827857345343) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:228 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.00629714410752 - timing_s/reward:0.00018015876412391663 - timing_s/old_log_prob:9.025006563402712 - timing_s/ref:20.03187674842775 - timing_s/adv:0.05956780258566141 - timing_s/update_actor:16.988980371505022 - timing_s/update_weights:27.069515412673354 - timing_s/step:104.63312563952059 - timing_s/stop_profile:0.0001397356390953064 - timing_per_token_ms/adv:6.395656610859962e-05 - timing_per_token_ms/update_actor:0.018240673637160622 - timing_per_token_ms/gen:0.039983206780944934 - timing_per_token_ms/ref:0.021507760802452865 - perf/total_num_tokens:1593092 - perf/time_per_step:104.63312563952059 - perf/throughput:3806.3758256837336 - frontier/active_count:15.0 - frontier/completed_count:49.0 - frontier/blacklisted_count:1872.0 - frontier/mean_score:1.9841254699782946 - frontier/mean_frontier_pct:0.7936512893049856 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:8.0 - frontier/force_completed_count:32.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:2.9997526323890895 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:176.0 - frontier/cluster_11/score:1.317213086167943 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:160.0 - frontier/cluster_12/score:1.9876902870406998 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:1.2217201049 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:160.0 - 
frontier/cluster_17/score:1.2073138672009998 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:128.0 - frontier/cluster_18/score:2.9585919532490896 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:1.6149816870406997 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:2.3916601048999997 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:160.0 - frontier/cluster_38/score:1.9932524652406993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:160.0 - frontier/cluster_41/score:2.0872636011079426 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:2.0842308490970733 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:192.0 - frontier/cluster_44/score:1.3766075760355598 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:2.078413607 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:2.1495389249 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:192.0 - frontier/cluster_49/score:2.2936513034046166 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:36.0 - cluster/prob_snapshot/cluster_0:0.10079176536558801 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.044258393470192274 - cluster/prob_snapshot/cluster_12:0.06678644461136973 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - 
cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.04104982685103293 - cluster/prob_snapshot/cluster_17:0.040565776894953026 - cluster/prob_snapshot/cluster_18:0.09940876549107401 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.05426342609466685 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.08035984084972084 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.06697333394151075 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.07013211051721026 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0700302099718816 - cluster/prob_snapshot/cluster_44:0.04625404985269133 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.06983475048825875 - cluster/prob_snapshot/cluster_47:0.07222456299343862 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.0770667426064108 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - 
cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(RewardLoopWorker pid=2826762)[0m WARNING:2026-04-12 12:34:38,799:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:   5%|▍         | 37/800 [1:04:02<22:23:38, 105.66s/it]
[36m(TaskRunner pid=2823680)[0m step:37 - global_seqlen/min:316114 - global_seqlen/max:419836 - global_seqlen/minmax_diff:103722 - global_seqlen/balanced_min:364747 - global_seqlen/balanced_max:364859 - global_seqlen/mean:364819.75 - frontier/skipped_zero_acc_count:41.0 - actor/entropy:np.float64(0.3228762165050615) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01179598830640316 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0559637717960868) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00021729987706335535) - actor/ppo_kl:np.float64(-7.735820396057782e-06) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.24681061912666669) - perf/mfu/actor:np.float64(0.2195314440221381) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(114.22320938110352) - actor/lr:np.float64(1e-06) - training/global_step:37 - training/epoch:0 - critic/score/mean:0.5502873659133911 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5426984429359436 - critic/rewards/max:1.0034706592559814 - critic/rewards/min:-0.034867141395807266 - critic/advantages/mean:-0.2637586295604706 - critic/advantages/max:2.474851369857788 - critic/advantages/min:-2.474843978881836 - critic/returns/mean:-0.2637586295604706 - critic/returns/max:2.474851369857788 - critic/returns/min:-2.474843978881836 - response_length/mean:1065.313232421875 - response_length/max:8192.0 - response_length/min:80.0 - response_length/clip_ratio:0.010057471692562103 - response_length_non_aborted/mean:1065.313232421875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:80.0 - response_length_non_aborted/clip_ratio:0.010057471692562103 - response/aborted_ratio:0.0 - prompt_length/mean:240.44827270507812 - prompt_length/max:962.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.878298103809357e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.7641800437122583) - timing_s/agent_loop/generate_sequences/max:np.float64(27.967914004810154) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.1603005392389605) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.967914004810154) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:210 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.545516662299633 - timing_s/reward:0.00022337865084409714 - timing_s/old_log_prob:8.838151236064732 - timing_s/ref:19.288252249360085 - timing_s/adv:0.09283739980310202 - timing_s/update_actor:19.534622250124812 - timing_s/update_weights:26.792448515072465 - timing_s/step:104.50524803251028 - timing_s/stop_profile:6.510410457849503e-05 - timing_per_token_ms/adv:0.00010215270496924771 - timing_per_token_ms/update_actor:0.021494726345578075 - timing_per_token_ms/gen:0.03984786280854699 - timing_per_token_ms/ref:0.0212236355776896 - perf/total_num_tokens:1459279 - perf/time_per_step:104.50524803251028 - perf/throughput:3490.9227705627677 - frontier/active_count:11.0 - frontier/completed_count:53.0 - frontier/blacklisted_count:1911.0 - frontier/mean_score:2.206458636279361 - frontier/mean_frontier_pct:0.7944972233344578 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:33.0 - frontier/replay_slots_count:8.0 - 
frontier/replay_pool_size:1546.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:1.69138320092849 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:1.7552040734299998 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:128.0 - frontier/cluster_18/score:2.9710143672743627 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:2.03048718092849 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - 
frontier/cluster_35/score:2.5741620734299997 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:1.7610845207755599 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:160.0 - frontier/cluster_43/score:2.358961594367951 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:192.0 - frontier/cluster_44/score:1.8636253032248917 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:2.3548895249 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:2.4046772474299996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:192.0 - frontier/cluster_49/score:2.5055559123832314 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - 
frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:37.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.06968728379816741 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0723167903770538 - cluster/prob_snapshot/cluster_17:0.0 - 
cluster/prob_snapshot/cluster_18:0.12240982485046854 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.08365882808119898 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.1060589716482467 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0725590727899365 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.09719241979313423 - cluster/prob_snapshot/cluster_44:0.07678389221791121 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.09702464500354001 - cluster/prob_snapshot/cluster_47:0.09907596675470075 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.10323230468564212 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:   5%|▍         | 38/800 [1:05:56<22:50:32, 107.92s/it]
(TaskRunner pid=2823680) step:38 - global_seqlen/min:319619 - global_seqlen/max:434729 - global_seqlen/minmax_diff:115110 - global_seqlen/balanced_min:372915 - global_seqlen/balanced_max:373005 - global_seqlen/mean:372961.75 - frontier/skipped_zero_acc_count:35.0 - actor/entropy:np.float64(0.31115160323679447) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010916761122643948 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0578232710249722) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000220429562200189) - actor/ppo_kl:np.float64(1.0923700905153809e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.23502514138817787) - perf/mfu/actor:np.float64(0.20626977813071967) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(113.95749282836914) - actor/lr:np.float64(1e-06) - training/global_step:38 - training/epoch:0 - critic/score/mean:0.5147849321365356 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.506918728351593 - critic/rewards/max:1.0013636350631714 - critic/rewards/min:-0.05335532873868942 - critic/advantages/mean:-0.17011578381061554 - critic/advantages/max:2.4748375415802 - critic/advantages/min:-2.4748568534851074 - critic/returns/mean:-0.17011578381061554 - critic/returns/max:2.4748375415802 - critic/returns/min:-2.4748568534851074 - response_length/mean:1169.768798828125 - response_length/max:8192.0 - response_length/min:120.0 - response_length/clip_ratio:0.01344086043536663 - response_length_non_aborted/mean:1169.768798828125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:120.0 - response_length_non_aborted/clip_ratio:0.01344086043536663 - response/aborted_ratio:0.0 - prompt_length/mean:231.50537109375 - prompt_length/max:556.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.749682456254959e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1602584039792418) - timing_s/agent_loop/generate_sequences/max:np.float64(28.691296176053584) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.507435237605023) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.691296176053584) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:202 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.58697268459946 - timing_s/reward:0.00011184997856616974 - timing_s/old_log_prob:11.376223389059305 - timing_s/ref:22.05298865865916 - timing_s/adv:0.0775076812133193 - timing_s/update_actor:21.22603544872254 - timing_s/update_weights:27.209597934037447 - timing_s/step:112.92691918089986 - timing_s/stop_profile:5.3600408136844635e-05 - timing_per_token_ms/adv:7.43444725934147e-05 - timing_per_token_ms/update_actor:0.020359768038231852 - timing_per_token_ms/gen:0.035144997730228215 - timing_per_token_ms/ref:0.02115297200575816 - perf/total_num_tokens:1491847 - perf/time_per_step:112.92691918089986 - perf/throughput:3302.6824135930356 - frontier/active_count:8.0 - frontier/completed_count:56.0 - frontier/blacklisted_count:1946.0 - frontier/mean_score:2.3788866994876026 - frontier/mean_frontier_pct:0.7871842520940745 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:33.0 - frontier/replay_slots_count:40.0 - frontier/replay_pool_size:1888.0 
- frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:192.0 - frontier/cluster_12/score:1.4839682406499428 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:144.0 - frontier/cluster_16/score:2.1286428514009996 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:144.0 - frontier/cluster_18/score:2.9797100570920536 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:2.7019134514009995 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:176.0 - frontier/cluster_43/score:1.9512731160575658 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:2.5484226674299997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:2.5832740732009993 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:208.0 - frontier/cluster_49/score:2.6538891386682617 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:38.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.07797598352254335 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.11185078990203148 - cluster/prob_snapshot/cluster_17:0.0 - 
cluster/prob_snapshot/cluster_18:0.1565706165059198 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.1419736305633521 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.10253079289557264 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.13390836709346615 - cluster/prob_snapshot/cluster_47:0.13573965469632393 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.1394501648207906 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:   5%|▍         | 39/800 [1:07:47<23:02:22, 108.99s/it]
(TaskRunner pid=2823680) step:39 - global_seqlen/min:323352 - global_seqlen/max:405561 - global_seqlen/minmax_diff:82209 - global_seqlen/balanced_min:375531 - global_seqlen/balanced_max:375606 - global_seqlen/mean:375577.0 - frontier/skipped_zero_acc_count:28.0 - actor/entropy:np.float64(0.2871981231868267) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01088341511785984 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.062017721196298226) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00020543311191431712) - actor/ppo_kl:np.float64(4.156026202281282e-05) - actor/pg_clipfrac_lower:np.float64(2.338634221814573e-06) - actor/grad_norm:np.float64(0.22143977192732003) - perf/mfu/actor:np.float64(0.1938895929288566) - perf/max_memory_allocated_gb:np.float64(73.02605247497559) - perf/max_memory_reserved_gb:np.float64(79.0859375) - perf/cpu_memory_used_gb:np.float64(114.25849533081055) - actor/lr:np.float64(1e-06) - training/global_step:39 - training/epoch:0 - critic/score/mean:0.53125 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5236228704452515 - critic/rewards/max:1.0011063814163208 - critic/rewards/min:-0.045288801193237305 - critic/advantages/mean:-0.2564623951911926 - critic/advantages/max:2.4748482704162598 - critic/advantages/min:-2.4748568534851074 - critic/returns/mean:-0.2564623951911926 - critic/returns/max:2.4748482704162598 - critic/returns/min:-2.4748568534851074 - response_length/mean:1121.4962158203125 - response_length/max:8192.0 - response_length/min:139.0 - response_length/clip_ratio:0.012500000186264515 - response_length_non_aborted/mean:1121.4962158203125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:139.0 - response_length_non_aborted/clip_ratio:0.012500000186264515 - response/aborted_ratio:0.0 - prompt_length/mean:234.2100067138672 - prompt_length/max:452.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.680671453475952e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0302007785066962) - timing_s/agent_loop/generate_sequences/max:np.float64(28.61956003960222) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.524492768519849) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.61956003960222) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:195 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.303290338255465 - timing_s/reward:0.00012657884508371353 - timing_s/old_log_prob:10.206649737432599 - timing_s/ref:20.46244214475155 - timing_s/adv:0.08304117992520332 - timing_s/update_actor:22.77971330936998 - timing_s/update_weights:26.998621061444283 - timing_s/step:111.23522036802024 - timing_s/stop_profile:5.602557212114334e-05 - timing_per_token_ms/adv:7.656634680743277e-05 - timing_per_token_ms/update_actor:0.021003548251483296 - timing_per_token_ms/gen:0.03377551456174671 - timing_per_token_ms/ref:0.018866957853841446 - perf/total_num_tokens:1502308 - perf/time_per_step:111.23522036802024 - perf/throughput:3376.4215934252525 - frontier/active_count:7.0 - frontier/completed_count:57.0 - frontier/blacklisted_count:1974.0 - frontier/mean_score:2.418816503021411 - frontier/mean_frontier_pct:0.8125687146249483 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:5.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:33.0 - frontier/replay_slots_count:64.0 - frontier/replay_pool_size:2282.0 
- frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:192.0 - frontier/cluster_12/score:1.9387777684549599 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:144.0 - 
frontier/cluster_18/score:2.9857970399644373 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:144.0 - frontier/cluster_35/score:2.7913394159806995 - frontier/cluster_35/cluster_size:184.0 - 
frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:192.0 - frontier/cluster_43/score:1.6658911812402961 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:128.0 - frontier/cluster_46/score:2.0838958672009995 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:144.0 - frontier/cluster_47/score:2.7082918512406993 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:208.0 - frontier/cluster_49/score:2.757722397067783 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - 
frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:39.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.11450568998948622 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.1763434447167976 - cluster/prob_snapshot/cluster_19:0.0 - 
cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.16485862950472796 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.09838880054176348 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.12307647530446324 - cluster/prob_snapshot/cluster_47:0.15995377715020648 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.16287318279255492 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:   5%|▌         | 40/800 [1:09:42<23:21:54, 110.68s/it]
[36m(TaskRunner pid=2823680)[0m step:40 - global_seqlen/min:377430 - global_seqlen/max:489483 - global_seqlen/minmax_diff:112053 - global_seqlen/balanced_min:427115 - global_seqlen/balanced_max:427246 - global_seqlen/mean:427169.25 - frontier/skipped_zero_acc_count:43.0 - actor/entropy:np.float64(0.36853436068740003) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.007922407239675522 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.059907161514274776) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002557581227064961) - actor/ppo_kl:np.float64(-9.683551297051286e-06) - actor/pg_clipfrac_lower:np.float64(2.1953505364403694e-06) - actor/grad_norm:np.float64(0.23015321791172028) - perf/mfu/actor:np.float64(0.26009865305414703) - perf/max_memory_allocated_gb:np.float64(75.54081106185913) - perf/max_memory_reserved_gb:np.float64(81.859375) - perf/cpu_memory_used_gb:np.float64(114.70840072631836) - actor/lr:np.float64(1e-06) - training/global_step:40 - training/epoch:0 - critic/score/mean:0.42941176891326904 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4219311475753784 - critic/rewards/max:1.0048307180404663 - critic/rewards/min:-0.040504612028598785 - critic/advantages/mean:-0.15952208638191223 - critic/advantages/max:2.4748497009277344 - critic/advantages/min:-2.4748613834381104 - critic/returns/mean:-0.15952208638191223 - critic/returns/max:2.4748497009277344 - critic/returns/min:-2.4748613834381104 - response_length/mean:1337.9632568359375 - response_length/max:8192.0 - response_length/min:175.0 - response_length/clip_ratio:0.007352941203862429 - response_length_non_aborted/mean:1337.9632568359375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:175.0 - response_length_non_aborted/clip_ratio:0.007352941203862429 - response/aborted_ratio:0.0 - prompt_length/mean:248.0117645263672 - prompt_length/max:667.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.448492735624313e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3569276379421353) - timing_s/agent_loop/generate_sequences/max:np.float64(29.32781553734094) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.75719097372712) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.32781553734094) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:220 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.164497217163444 - timing_s/reward:0.0002380330115556717 - timing_s/old_log_prob:9.865481211803854 - timing_s/ref:23.03356970101595 - timing_s/adv:0.05809784680604935 - timing_s/update_actor:19.502490431070328 - timing_s/update_weights:29.72320107743144 - timing_s/step:113.74824312329292 - timing_s/stop_profile:6.549432873725891e-05 - timing_per_token_ms/adv:5.387096896791948e-05 - timing_per_token_ms/update_actor:0.018083597148043397 - timing_per_token_ms/gen:0.034253663895586955 - timing_per_token_ms/ref:0.021357774630206088 - perf/total_num_tokens:1708677 - perf/time_per_step:113.74824312329292 - perf/throughput:3755.3920682272583 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:43.0 - frontier/mean_score:1.9906249999999999 - frontier/mean_frontier_pct:0.019858022692650888 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:1.7 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.3 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:1.7 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.3 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - 
frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.0 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:1.7 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.3 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.7 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:1.7 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - frontier/cluster_54/cluster_size:122.0 - 
frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:40.0 - cluster/prob_snapshot/cluster_0:0.015698587127158558 - cluster/prob_snapshot/cluster_1:0.015698587127158558 - cluster/prob_snapshot/cluster_2:0.015698587127158558 - cluster/prob_snapshot/cluster_3:0.015698587127158558 - cluster/prob_snapshot/cluster_4:0.015698587127158558 - cluster/prob_snapshot/cluster_5:0.01805337519623234 - cluster/prob_snapshot/cluster_6:0.015698587127158558 - cluster/prob_snapshot/cluster_7:0.015698587127158558 - cluster/prob_snapshot/cluster_8:0.015698587127158558 - cluster/prob_snapshot/cluster_9:0.015698587127158558 - cluster/prob_snapshot/cluster_10:0.015698587127158558 - cluster/prob_snapshot/cluster_11:0.015698587127158558 - cluster/prob_snapshot/cluster_12:0.013343799058084773 - cluster/prob_snapshot/cluster_13:0.01805337519623234 - cluster/prob_snapshot/cluster_14:0.013343799058084773 - cluster/prob_snapshot/cluster_15:0.01805337519623234 - cluster/prob_snapshot/cluster_16:0.01805337519623234 - cluster/prob_snapshot/cluster_17:0.015698587127158558 - 
cluster/prob_snapshot/cluster_18:0.015698587127158558 - cluster/prob_snapshot/cluster_19:0.015698587127158558 - cluster/prob_snapshot/cluster_20:0.015698587127158558 - cluster/prob_snapshot/cluster_21:0.015698587127158558 - cluster/prob_snapshot/cluster_22:0.015698587127158558 - cluster/prob_snapshot/cluster_23:0.015698587127158558 - cluster/prob_snapshot/cluster_24:0.015698587127158558 - cluster/prob_snapshot/cluster_25:0.015698587127158558 - cluster/prob_snapshot/cluster_26:0.015698587127158558 - cluster/prob_snapshot/cluster_27:0.015698587127158558 - cluster/prob_snapshot/cluster_28:0.015698587127158558 - cluster/prob_snapshot/cluster_29:0.013343799058084773 - cluster/prob_snapshot/cluster_30:0.015698587127158558 - cluster/prob_snapshot/cluster_31:0.015698587127158558 - cluster/prob_snapshot/cluster_32:0.015698587127158558 - cluster/prob_snapshot/cluster_33:0.015698587127158558 - cluster/prob_snapshot/cluster_34:0.015698587127158558 - cluster/prob_snapshot/cluster_35:0.015698587127158558 - cluster/prob_snapshot/cluster_36:0.015698587127158558 - cluster/prob_snapshot/cluster_37:0.015698587127158558 - cluster/prob_snapshot/cluster_38:0.013343799058084773 - cluster/prob_snapshot/cluster_39:0.015698587127158558 - cluster/prob_snapshot/cluster_40:0.015698587127158558 - cluster/prob_snapshot/cluster_41:0.01805337519623234 - cluster/prob_snapshot/cluster_42:0.015698587127158558 - cluster/prob_snapshot/cluster_43:0.01805337519623234 - cluster/prob_snapshot/cluster_44:0.013343799058084773 - cluster/prob_snapshot/cluster_45:0.015698587127158558 - cluster/prob_snapshot/cluster_46:0.01805337519623234 - cluster/prob_snapshot/cluster_47:0.015698587127158558 - cluster/prob_snapshot/cluster_48:0.015698587127158558 - cluster/prob_snapshot/cluster_49:0.013343799058084773 - cluster/prob_snapshot/cluster_50:0.013343799058084773 - cluster/prob_snapshot/cluster_51:0.015698587127158558 - cluster/prob_snapshot/cluster_52:0.015698587127158558 - 
cluster/prob_snapshot/cluster_53:0.013343799058084773 - cluster/prob_snapshot/cluster_54:0.015698587127158558 - cluster/prob_snapshot/cluster_55:0.015698587127158558 - cluster/prob_snapshot/cluster_56:0.015698587127158558 - cluster/prob_snapshot/cluster_57:0.015698587127158558 - cluster/prob_snapshot/cluster_58:0.015698587127158558 - cluster/prob_snapshot/cluster_59:0.015698587127158558 - cluster/prob_snapshot/cluster_60:0.013343799058084773 - cluster/prob_snapshot/cluster_61:0.015698587127158558 - cluster/prob_snapshot/cluster_62:0.015698587127158558 - cluster/prob_snapshot/cluster_63:0.015698587127158558
[36m(TaskRunner pid=2823680)[0m Training Progress:   5%|▌         | 41/800 [1:11:33<23:23:49, 110.97s/it]
[36m(TaskRunner pid=2823680)[0m step:41 - global_seqlen/min:368137 - global_seqlen/max:422241 - global_seqlen/minmax_diff:54104 - global_seqlen/balanced_min:390163 - global_seqlen/balanced_max:390272 - global_seqlen/mean:390206.75 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.333116708861457) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008611251600086689 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.029248030565213412) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00019164406310462963) - actor/ppo_kl:np.float64(-1.8884810484741543e-05) - actor/pg_clipfrac_lower:np.float64(2.732553993054252e-07) - actor/grad_norm:np.float64(0.22208633460104465) - perf/mfu/actor:np.float64(0.23146842469313372) - perf/max_memory_allocated_gb:np.float64(75.54081106185913) - perf/max_memory_reserved_gb:np.float64(81.859375) - perf/cpu_memory_used_gb:np.float64(114.97455596923828) - actor/lr:np.float64(1e-06) - training/global_step:41 - training/epoch:0 - critic/score/mean:0.4972222149372101 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.48977431654930115 - critic/rewards/max:1.0022889375686646 - critic/rewards/min:-0.05909040570259094 - critic/advantages/mean:-0.2281782329082489 - critic/advantages/max:2.474843978881836 - critic/advantages/min:-2.474863290786743 - critic/returns/mean:-0.2281782329082489 - critic/returns/max:2.474843978881836 - critic/returns/min:-2.474863290786743 - response_length/mean:1217.71533203125 - response_length/max:8192.0 - response_length/min:146.0 - response_length/clip_ratio:0.013888888992369175 - response_length_non_aborted/mean:1217.71533203125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:146.0 - response_length_non_aborted/clip_ratio:0.013888888992369175 - response/aborted_ratio:0.0 - prompt_length/mean:234.12222290039062 - prompt_length/max:474.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.954666554927826e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2044911235570908) - timing_s/agent_loop/generate_sequences/max:np.float64(29.454453280195594) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.922916066390826) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.454453280195594) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:198 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.40440760180354 - timing_s/reward:0.0001242421567440033 - timing_s/old_log_prob:9.98925554100424 - timing_s/ref:21.404071314260364 - timing_s/adv:0.0723738158121705 - timing_s/update_actor:19.766595693305135 - timing_s/update_weights:27.904530736617744 - timing_s/step:110.9179795961827 - timing_s/stop_profile:4.978105425834656e-05 - timing_per_token_ms/adv:6.923583984296768e-05 - timing_per_token_ms/update_actor:0.01890955780491306 - timing_per_token_ms/gen:0.035818909047343375 - timing_per_token_ms/ref:0.020476035937466566 - perf/total_num_tokens:1560827 - perf/time_per_step:110.9179795961827 - perf/throughput:3517.975637679476 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:81.0 - frontier/mean_score:1.9999999999999998 - frontier/mean_frontier_pct:0.03771764491918013 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.7 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.7 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:2.09 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.51 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:1.7 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.51 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.3 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:1.7 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.0 - 
frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.7 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.3 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.3 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:1.7 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - 
frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:1.7 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:1.7 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.3 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.7 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:1.7 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - 
frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:41.0 - cluster/prob_snapshot/cluster_0:0.015625000000000003 - cluster/prob_snapshot/cluster_1:0.015625000000000003 - cluster/prob_snapshot/cluster_2:0.015625000000000003 - cluster/prob_snapshot/cluster_3:0.015625000000000003 - cluster/prob_snapshot/cluster_4:0.015625000000000003 - cluster/prob_snapshot/cluster_5:0.017968750000000002 - cluster/prob_snapshot/cluster_6:0.013281250000000001 - cluster/prob_snapshot/cluster_7:0.015625000000000003 - cluster/prob_snapshot/cluster_8:0.015625000000000003 - cluster/prob_snapshot/cluster_9:0.013281250000000001 - cluster/prob_snapshot/cluster_10:0.015625000000000003 - cluster/prob_snapshot/cluster_11:0.017968750000000002 - cluster/prob_snapshot/cluster_12:0.016328125000000002 - cluster/prob_snapshot/cluster_13:0.019609375000000002 - cluster/prob_snapshot/cluster_14:0.013281250000000001 - cluster/prob_snapshot/cluster_15:0.019609375000000002 - cluster/prob_snapshot/cluster_16:0.017968750000000002 - cluster/prob_snapshot/cluster_17:0.013281250000000001 - 
cluster/prob_snapshot/cluster_18:0.015625000000000003 - cluster/prob_snapshot/cluster_19:0.015625000000000003 - cluster/prob_snapshot/cluster_20:0.013281250000000001 - cluster/prob_snapshot/cluster_21:0.017968750000000002 - cluster/prob_snapshot/cluster_22:0.015625000000000003 - cluster/prob_snapshot/cluster_23:0.015625000000000003 - cluster/prob_snapshot/cluster_24:0.015625000000000003 - cluster/prob_snapshot/cluster_25:0.015625000000000003 - cluster/prob_snapshot/cluster_26:0.015625000000000003 - cluster/prob_snapshot/cluster_27:0.017968750000000002 - cluster/prob_snapshot/cluster_28:0.015625000000000003 - cluster/prob_snapshot/cluster_29:0.011640625000000002 - cluster/prob_snapshot/cluster_30:0.015625000000000003 - cluster/prob_snapshot/cluster_31:0.015625000000000003 - cluster/prob_snapshot/cluster_32:0.015625000000000003 - cluster/prob_snapshot/cluster_33:0.015625000000000003 - cluster/prob_snapshot/cluster_34:0.013281250000000001 - cluster/prob_snapshot/cluster_35:0.015625000000000003 - cluster/prob_snapshot/cluster_36:0.017968750000000002 - cluster/prob_snapshot/cluster_37:0.015625000000000003 - cluster/prob_snapshot/cluster_38:0.013281250000000001 - cluster/prob_snapshot/cluster_39:0.015625000000000003 - cluster/prob_snapshot/cluster_40:0.015625000000000003 - cluster/prob_snapshot/cluster_41:0.017968750000000002 - cluster/prob_snapshot/cluster_42:0.013281250000000001 - cluster/prob_snapshot/cluster_43:0.017968750000000002 - cluster/prob_snapshot/cluster_44:0.013281250000000001 - cluster/prob_snapshot/cluster_45:0.015625000000000003 - cluster/prob_snapshot/cluster_46:0.017968750000000002 - cluster/prob_snapshot/cluster_47:0.015625000000000003 - cluster/prob_snapshot/cluster_48:0.015625000000000003 - cluster/prob_snapshot/cluster_49:0.013281250000000001 - cluster/prob_snapshot/cluster_50:0.013281250000000001 - cluster/prob_snapshot/cluster_51:0.015625000000000003 - cluster/prob_snapshot/cluster_52:0.015625000000000003 - 
cluster/prob_snapshot/cluster_53:0.013281250000000001 - cluster/prob_snapshot/cluster_54:0.015625000000000003 - cluster/prob_snapshot/cluster_55:0.015625000000000003 - cluster/prob_snapshot/cluster_56:0.015625000000000003 - cluster/prob_snapshot/cluster_57:0.015625000000000003 - cluster/prob_snapshot/cluster_58:0.017968750000000002 - cluster/prob_snapshot/cluster_59:0.015625000000000003 - cluster/prob_snapshot/cluster_60:0.013281250000000001 - cluster/prob_snapshot/cluster_61:0.015625000000000003 - cluster/prob_snapshot/cluster_62:0.017968750000000002 - cluster/prob_snapshot/cluster_63:0.015625000000000003
[36m(RewardLoopWorker pid=2826759)[0m WARNING:2026-04-12 12:43:54,664:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:   5%|▌         | 42/800 [1:12:59<21:44:59, 103.30s/it]
[36m(TaskRunner pid=2823680)[0m step:42 - global_seqlen/min:396939 - global_seqlen/max:437950 - global_seqlen/minmax_diff:41011 - global_seqlen/balanced_min:409149 - global_seqlen/balanced_max:409369 - global_seqlen/mean:409260.25 - frontier/skipped_zero_acc_count:60.0 - actor/entropy:np.float64(0.27428823753314857) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.007941900752484798 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0562624161648273) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000138278575564875) - actor/ppo_kl:np.float64(-1.77174463728721e-05) - actor/pg_clipfrac_lower:np.float64(3.334062325045713e-07) - actor/grad_norm:np.float64(0.20718500514825186) - perf/mfu/actor:np.float64(0.26932548774150417) - perf/max_memory_allocated_gb:np.float64(75.54081106185913) - perf/max_memory_reserved_gb:np.float64(81.859375) - perf/cpu_memory_used_gb:np.float64(114.41609764099121) - actor/lr:np.float64(1e-06) - training/global_step:42 - training/epoch:0 - critic/score/mean:0.5367646813392639 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5301499366760254 - critic/rewards/max:1.0030030012130737 - critic/rewards/min:-0.050603047013282776 - critic/advantages/mean:-0.17487186193466187 - critic/advantages/max:2.4748427867889404 - critic/advantages/min:-2.4748618602752686 - critic/returns/mean:-0.17487186193466187 - critic/returns/max:2.4748427867889404 - critic/returns/min:-2.4748618602752686 - response_length/mean:1154.03857421875 - response_length/max:8192.0 - response_length/min:207.0 - response_length/clip_ratio:0.0018382353009656072 - response_length_non_aborted/mean:1154.03857421875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:207.0 - response_length_non_aborted/clip_ratio:0.0018382353009656072 - response/aborted_ratio:0.0 - prompt_length/mean:242.26470947265625 - prompt_length/max:816.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.9940225481987e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.37276802584528923) - timing_s/agent_loop/generate_sequences/max:np.float64(28.50852425582707) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.408918923056262) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.50852425582707) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:282 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.162233856506646 - timing_s/reward:0.00011973083019256592 - timing_s/old_log_prob:7.0201839515939355 - timing_s/ref:10.755849221721292 - timing_s/adv:0.04671001806855202 - timing_s/update_actor:17.733714709989727 - timing_s/update_weights:18.4068168932572 - timing_s/step:85.04281487502158 - timing_s/stop_profile:5.411170423030853e-05 - timing_per_token_ms/adv:6.149380529279915e-05 - timing_per_token_ms/update_actor:0.023346460664898684 - timing_per_token_ms/gen:0.04804456513252954 - timing_per_token_ms/ref:0.014160090814534296 - perf/total_num_tokens:1637041 - perf/time_per_step:85.04281487502158 - perf/throughput:4812.402442245667 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:141.0 - frontier/mean_score:1.9887968749999998 - frontier/mean_frontier_pct:0.052902249332020854 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.3 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.7 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.3 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.7 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.91 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:1.763 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.51 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.09 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.51 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.3 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:1.7 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - 
frontier/cluster_18/score:2.3 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.7 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.3 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.51 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:1.7 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:1.7 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:1.7 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:1.7 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.3 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.3 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:1.7 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.7 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - frontier/cluster_54/cluster_size:122.0 - 
frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:1.7 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.51 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:42.0 - cluster/prob_snapshot/cluster_0:0.01806997006670176 - cluster/prob_snapshot/cluster_1:0.01571301744930588 - cluster/prob_snapshot/cluster_2:0.01571301744930588 - cluster/prob_snapshot/cluster_3:0.01571301744930588 - cluster/prob_snapshot/cluster_4:0.01571301744930588 - cluster/prob_snapshot/cluster_5:0.01806997006670176 - cluster/prob_snapshot/cluster_6:0.013356064831909997 - cluster/prob_snapshot/cluster_7:0.01571301744930588 - cluster/prob_snapshot/cluster_8:0.01806997006670176 - cluster/prob_snapshot/cluster_9:0.013356064831909997 - cluster/prob_snapshot/cluster_10:0.01571301744930588 - cluster/prob_snapshot/cluster_11:0.015005931664087114 - cluster/prob_snapshot/cluster_12:0.013851024881563131 - cluster/prob_snapshot/cluster_13:0.019719836898878875 - cluster/prob_snapshot/cluster_14:0.01642010323452464 - cluster/prob_snapshot/cluster_15:0.019719836898878875 - cluster/prob_snapshot/cluster_16:0.01806997006670176 - cluster/prob_snapshot/cluster_17:0.013356064831909997 - 
cluster/prob_snapshot/cluster_18:0.01806997006670176 - cluster/prob_snapshot/cluster_19:0.01571301744930588 - cluster/prob_snapshot/cluster_20:0.013356064831909997 - cluster/prob_snapshot/cluster_21:0.01806997006670176 - cluster/prob_snapshot/cluster_22:0.013356064831909997 - cluster/prob_snapshot/cluster_23:0.01571301744930588 - cluster/prob_snapshot/cluster_24:0.01571301744930588 - cluster/prob_snapshot/cluster_25:0.01571301744930588 - cluster/prob_snapshot/cluster_26:0.013356064831909997 - cluster/prob_snapshot/cluster_27:0.019719836898878875 - cluster/prob_snapshot/cluster_28:0.01571301744930588 - cluster/prob_snapshot/cluster_29:0.01170619799973288 - cluster/prob_snapshot/cluster_30:0.01571301744930588 - cluster/prob_snapshot/cluster_31:0.01571301744930588 - cluster/prob_snapshot/cluster_32:0.01571301744930588 - cluster/prob_snapshot/cluster_33:0.013356064831909997 - cluster/prob_snapshot/cluster_34:0.013356064831909997 - cluster/prob_snapshot/cluster_35:0.01571301744930588 - cluster/prob_snapshot/cluster_36:0.01806997006670176 - cluster/prob_snapshot/cluster_37:0.013356064831909997 - cluster/prob_snapshot/cluster_38:0.013356064831909997 - cluster/prob_snapshot/cluster_39:0.01571301744930588 - cluster/prob_snapshot/cluster_40:0.01571301744930588 - cluster/prob_snapshot/cluster_41:0.01806997006670176 - cluster/prob_snapshot/cluster_42:0.013356064831909997 - cluster/prob_snapshot/cluster_43:0.01806997006670176 - cluster/prob_snapshot/cluster_44:0.01170619799973288 - cluster/prob_snapshot/cluster_45:0.01571301744930588 - cluster/prob_snapshot/cluster_46:0.01806997006670176 - cluster/prob_snapshot/cluster_47:0.01806997006670176 - cluster/prob_snapshot/cluster_48:0.01571301744930588 - cluster/prob_snapshot/cluster_49:0.013356064831909997 - cluster/prob_snapshot/cluster_50:0.013356064831909997 - cluster/prob_snapshot/cluster_51:0.013356064831909997 - cluster/prob_snapshot/cluster_52:0.01571301744930588 - cluster/prob_snapshot/cluster_53:0.013356064831909997 - 
cluster/prob_snapshot/cluster_54:0.01571301744930588 - cluster/prob_snapshot/cluster_55:0.01571301744930588 - cluster/prob_snapshot/cluster_56:0.013356064831909997 - cluster/prob_snapshot/cluster_57:0.01571301744930588 - cluster/prob_snapshot/cluster_58:0.019719836898878875 - cluster/prob_snapshot/cluster_59:0.01571301744930588 - cluster/prob_snapshot/cluster_60:0.013356064831909997 - cluster/prob_snapshot/cluster_61:0.01571301744930588 - cluster/prob_snapshot/cluster_62:0.01806997006670176 - cluster/prob_snapshot/cluster_63:0.01571301744930588
[36m(RewardLoopWorker pid=2826755)[0m WARNING:2026-04-12 12:45:23,602:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:   5%|▌         | 43/800 [1:14:37<21:25:26, 101.88s/it]
[36m(RewardLoopWorker pid=2826759)[0m WARNING:2026-04-12 12:45:25,960:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m step:43 - global_seqlen/min:384526 - global_seqlen/max:432026 - global_seqlen/minmax_diff:47500 - global_seqlen/balanced_min:405649 - global_seqlen/balanced_max:405856 - global_seqlen/mean:405769.75 - frontier/skipped_zero_acc_count:50.0 - actor/entropy:np.float64(0.30943268633041626) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.006504089571535587 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.015099246895260876) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000223891255583835) - actor/ppo_kl:np.float64(-5.450728142856557e-05) - actor/pg_clipfrac_lower:np.float64(2.2732995906456685e-06) - actor/grad_norm:np.float64(0.21581896245479584) - perf/mfu/actor:np.float64(0.29909592706952914) - perf/max_memory_allocated_gb:np.float64(75.54081106185913) - perf/max_memory_reserved_gb:np.float64(81.859375) - perf/cpu_memory_used_gb:np.float64(114.62110900878906) - actor/lr:np.float64(1e-06) - training/global_step:43 - training/epoch:0 - critic/score/mean:0.45673078298568726 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.45035842061042786 - critic/rewards/max:1.0047433376312256 - critic/rewards/min:-0.06667677313089371 - critic/advantages/mean:-0.10611368715763092 - critic/advantages/max:2.4748497009277344 - critic/advantages/min:-2.474855661392212 - critic/returns/mean:-0.10611368715763092 - critic/returns/max:2.4748497009277344 - critic/returns/min:-2.474855661392212 - response_length/mean:1215.080078125 - response_length/max:8192.0 - response_length/min:143.0 - response_length/clip_ratio:0.0016025641234591603 - response_length_non_aborted/mean:1215.080078125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:143.0 - response_length_non_aborted/clip_ratio:0.0016025641234591603 - response/aborted_ratio:0.0 - prompt_length/mean:240.25640869140625 - prompt_length/max:381.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.744560182094574e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1672243252396584) - timing_s/agent_loop/generate_sequences/max:np.float64(28.825956646353006) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.521956742491966) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.825956646353006) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:222 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.958254400640726 - timing_s/reward:0.00010984111577272415 - timing_s/old_log_prob:7.84413767978549 - timing_s/ref:18.96123427990824 - timing_s/adv:0.06476176902651787 - timing_s/update_actor:15.875060520134866 - timing_s/update_weights:24.28483695629984 - timing_s/step:98.38648465741426 - timing_s/stop_profile:6.551574915647507e-05 - timing_per_token_ms/adv:7.131332411275684e-05 - timing_per_token_ms/update_actor:0.01748104403569408 - timing_per_token_ms/gen:0.040830712336477656 - timing_per_token_ms/ref:0.020879427262515543 - perf/total_num_tokens:1623079 - perf/time_per_step:98.38648465741426 - perf/throughput:4124.242790185123 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:191.0 - frontier/mean_score:1.9756249999999997 - frontier/mean_frontier_pct:0.07096626214202609 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.3 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.7 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.3 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.51 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.7 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.91 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:1.763 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.51 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.09 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.51 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.3 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:1.7 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - 
frontier/cluster_18/score:2.3 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.7 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.3 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.3 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.51 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:1.7 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:1.7 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:1.7 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:1.7 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:1.7 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.51 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:1.7 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.51 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.7 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:1.7 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:1.49 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:1.7 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - 
frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:1.49 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.6569999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:43.0 - cluster/prob_snapshot/cluster_0:0.018190446061372986 - cluster/prob_snapshot/cluster_1:0.015817779183802595 - cluster/prob_snapshot/cluster_2:0.015817779183802595 - cluster/prob_snapshot/cluster_3:0.015817779183802595 - cluster/prob_snapshot/cluster_4:0.015817779183802595 - cluster/prob_snapshot/cluster_5:0.018190446061372986 - cluster/prob_snapshot/cluster_6:0.013445112306232207 - cluster/prob_snapshot/cluster_7:0.018190446061372986 - cluster/prob_snapshot/cluster_8:0.019851312875672256 - cluster/prob_snapshot/cluster_9:0.013445112306232207 - cluster/prob_snapshot/cluster_10:0.015817779183802595 - cluster/prob_snapshot/cluster_11:0.01510597912053148 - cluster/prob_snapshot/cluster_12:0.013943372350521987 - cluster/prob_snapshot/cluster_13:0.019851312875672256 - cluster/prob_snapshot/cluster_14:0.01652957924707371 - cluster/prob_snapshot/cluster_15:0.019851312875672256 - cluster/prob_snapshot/cluster_16:0.018190446061372986 
- cluster/prob_snapshot/cluster_17:0.013445112306232207 - cluster/prob_snapshot/cluster_18:0.018190446061372986 - cluster/prob_snapshot/cluster_19:0.015817779183802595 - cluster/prob_snapshot/cluster_20:0.013445112306232207 - cluster/prob_snapshot/cluster_21:0.018190446061372986 - cluster/prob_snapshot/cluster_22:0.013445112306232207 - cluster/prob_snapshot/cluster_23:0.015817779183802595 - cluster/prob_snapshot/cluster_24:0.018190446061372986 - cluster/prob_snapshot/cluster_25:0.013445112306232207 - cluster/prob_snapshot/cluster_26:0.013445112306232207 - cluster/prob_snapshot/cluster_27:0.019851312875672256 - cluster/prob_snapshot/cluster_28:0.013445112306232207 - cluster/prob_snapshot/cluster_29:0.011784245491932934 - cluster/prob_snapshot/cluster_30:0.015817779183802595 - cluster/prob_snapshot/cluster_31:0.015817779183802595 - cluster/prob_snapshot/cluster_32:0.013445112306232207 - cluster/prob_snapshot/cluster_33:0.013445112306232207 - cluster/prob_snapshot/cluster_34:0.013445112306232207 - cluster/prob_snapshot/cluster_35:0.015817779183802595 - cluster/prob_snapshot/cluster_36:0.018190446061372986 - cluster/prob_snapshot/cluster_37:0.013445112306232207 - cluster/prob_snapshot/cluster_38:0.011784245491932934 - cluster/prob_snapshot/cluster_39:0.013445112306232207 - cluster/prob_snapshot/cluster_40:0.015817779183802595 - cluster/prob_snapshot/cluster_41:0.019851312875672256 - cluster/prob_snapshot/cluster_42:0.013445112306232207 - cluster/prob_snapshot/cluster_43:0.019851312875672256 - cluster/prob_snapshot/cluster_44:0.011784245491932934 - cluster/prob_snapshot/cluster_45:0.015817779183802595 - cluster/prob_snapshot/cluster_46:0.018190446061372986 - cluster/prob_snapshot/cluster_47:0.019851312875672256 - cluster/prob_snapshot/cluster_48:0.013445112306232207 - cluster/prob_snapshot/cluster_49:0.013445112306232207 - cluster/prob_snapshot/cluster_50:0.013445112306232207 - cluster/prob_snapshot/cluster_51:0.011784245491932934 - 
cluster/prob_snapshot/cluster_52:0.013445112306232207 - cluster/prob_snapshot/cluster_53:0.013445112306232207 - cluster/prob_snapshot/cluster_54:0.015817779183802595 - cluster/prob_snapshot/cluster_55:0.015817779183802595 - cluster/prob_snapshot/cluster_56:0.011784245491932934 - cluster/prob_snapshot/cluster_57:0.015817779183802595 - cluster/prob_snapshot/cluster_58:0.021013919645681747 - cluster/prob_snapshot/cluster_59:0.015817779183802595 - cluster/prob_snapshot/cluster_60:0.013445112306232207 - cluster/prob_snapshot/cluster_61:0.015817779183802595 - cluster/prob_snapshot/cluster_62:0.018190446061372986 - cluster/prob_snapshot/cluster_63:0.015817779183802595
[36m(RewardLoopWorker pid=2826760)[0m WARNING:2026-04-12 12:47:07,017:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:   6%|▌         | 44/800 [1:16:30<22:02:24, 104.95s/it]
[36m(TaskRunner pid=2823680)[0m step:44 - global_seqlen/min:373676 - global_seqlen/max:451205 - global_seqlen/minmax_diff:77529 - global_seqlen/balanced_min:414419 - global_seqlen/balanced_max:414466 - global_seqlen/mean:414438.75 - frontier/skipped_zero_acc_count:39.0 - actor/entropy:np.float64(0.2894288582106431) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008307166397571564 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.013423652213532478) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00023956478738303608) - actor/ppo_kl:np.float64(9.841617860217866e-06) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2150181531906128) - perf/mfu/actor:np.float64(0.2459330288915379) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(114.88625717163086) - actor/lr:np.float64(1e-06) - training/global_step:44 - training/epoch:0 - critic/score/mean:0.5028089880943298 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.49486592411994934 - critic/rewards/max:1.0019248723983765 - critic/rewards/min:-0.04734867811203003 - critic/advantages/mean:-0.23318776488304138 - critic/advantages/max:2.474856376647949 - critic/advantages/min:-2.4748542308807373 - critic/returns/mean:-0.23318776488304138 - critic/returns/max:2.474856376647949 - critic/returns/min:-2.4748542308807373 - response_length/mean:1311.3497314453125 - response_length/max:8192.0 - response_length/min:178.0 - response_length/clip_ratio:0.008426966145634651 - response_length_non_aborted/mean:1311.3497314453125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:178.0 - response_length_non_aborted/clip_ratio:0.008426966145634651 - response/aborted_ratio:0.0 - prompt_length/mean:239.7528076171875 - prompt_length/max:535.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.34479758143425e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3865255462005734) - timing_s/agent_loop/generate_sequences/max:np.float64(28.615878503769636) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.400175798686178) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.615878503769636) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:213 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.448425511829555 - timing_s/reward:0.00015895627439022064 - timing_s/old_log_prob:9.528423097915947 - timing_s/ref:21.533280936069787 - timing_s/adv:0.07846284937113523 - timing_s/update_actor:19.776218592189252 - timing_s/update_weights:30.12523986119777 - timing_s/step:111.87226998526603 - timing_s/stop_profile:5.3944066166877747e-05 - timing_per_token_ms/adv:7.104664530135346e-05 - timing_per_token_ms/update_actor:0.017906996737722128 - timing_per_token_ms/gen:0.0326111653892813 - timing_per_token_ms/ref:0.019497983887928384 - perf/total_num_tokens:1657755 - perf/time_per_step:111.87226998526603 - perf/throughput:3704.57084722231 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:230.0 - frontier/mean_score:1.9634375 - frontier/mean_frontier_pct:0.08958038351216521 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.3 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.7 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.3 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.51 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.7 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.3 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.91 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:1.763 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.51 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:2.3629999999999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.6569999999999996 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:1.91 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:1.7 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - 
frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.7 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.3 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.51 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.51 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:1.7 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:1.7 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:1.49 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.3 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:1.7 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.09 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.6569999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:1.7 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.51 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.7 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:1.7 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:1.49 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - 
frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:1.343 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.6569999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.7 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:44.0 - cluster/prob_snapshot/cluster_0:0.01830335826834315 - cluster/prob_snapshot/cluster_1:0.015915963711602737 - cluster/prob_snapshot/cluster_2:0.013528569154862327 - cluster/prob_snapshot/cluster_3:0.015915963711602737 - cluster/prob_snapshot/cluster_4:0.015915963711602737 - cluster/prob_snapshot/cluster_5:0.01830335826834315 - cluster/prob_snapshot/cluster_6:0.013528569154862327 - cluster/prob_snapshot/cluster_7:0.01830335826834315 - cluster/prob_snapshot/cluster_8:0.019974534458061435 - cluster/prob_snapshot/cluster_9:0.013528569154862327 - cluster/prob_snapshot/cluster_10:0.01830335826834315 - cluster/prob_snapshot/cluster_11:0.015199745344580615 - cluster/prob_snapshot/cluster_12:0.014029922011777813 - cluster/prob_snapshot/cluster_13:0.019974534458061435 - cluster/prob_snapshot/cluster_14:0.01880471112525863 - cluster/prob_snapshot/cluster_15:0.021144357790864234 - cluster/prob_snapshot/cluster_16:0.015199745344580615 
- cluster/prob_snapshot/cluster_17:0.013528569154862327 - cluster/prob_snapshot/cluster_18:0.015199745344580615 - cluster/prob_snapshot/cluster_19:0.015915963711602737 - cluster/prob_snapshot/cluster_20:0.013528569154862327 - cluster/prob_snapshot/cluster_21:0.01830335826834315 - cluster/prob_snapshot/cluster_22:0.013528569154862327 - cluster/prob_snapshot/cluster_23:0.015915963711602737 - cluster/prob_snapshot/cluster_24:0.019974534458061435 - cluster/prob_snapshot/cluster_25:0.013528569154862327 - cluster/prob_snapshot/cluster_26:0.013528569154862327 - cluster/prob_snapshot/cluster_27:0.019974534458061435 - cluster/prob_snapshot/cluster_28:0.013528569154862327 - cluster/prob_snapshot/cluster_29:0.011857392965144039 - cluster/prob_snapshot/cluster_30:0.015915963711602737 - cluster/prob_snapshot/cluster_31:0.015915963711602737 - cluster/prob_snapshot/cluster_32:0.013528569154862327 - cluster/prob_snapshot/cluster_33:0.013528569154862327 - cluster/prob_snapshot/cluster_34:0.011857392965144039 - cluster/prob_snapshot/cluster_35:0.01830335826834315 - cluster/prob_snapshot/cluster_36:0.01830335826834315 - cluster/prob_snapshot/cluster_37:0.013528569154862327 - cluster/prob_snapshot/cluster_38:0.011857392965144039 - cluster/prob_snapshot/cluster_39:0.01663218207862486 - cluster/prob_snapshot/cluster_40:0.015915963711602737 - cluster/prob_snapshot/cluster_41:0.021144357790864234 - cluster/prob_snapshot/cluster_42:0.013528569154862327 - cluster/prob_snapshot/cluster_43:0.019974534458061435 - cluster/prob_snapshot/cluster_44:0.011857392965144039 - cluster/prob_snapshot/cluster_45:0.013528569154862327 - cluster/prob_snapshot/cluster_46:0.01830335826834315 - cluster/prob_snapshot/cluster_47:0.019974534458061435 - cluster/prob_snapshot/cluster_48:0.013528569154862327 - cluster/prob_snapshot/cluster_49:0.013528569154862327 - cluster/prob_snapshot/cluster_50:0.013528569154862327 - cluster/prob_snapshot/cluster_51:0.011857392965144039 - 
cluster/prob_snapshot/cluster_52:0.011857392965144039 - cluster/prob_snapshot/cluster_53:0.013528569154862327 - cluster/prob_snapshot/cluster_54:0.013528569154862327 - cluster/prob_snapshot/cluster_55:0.015915963711602737 - cluster/prob_snapshot/cluster_56:0.010687569632341238 - cluster/prob_snapshot/cluster_57:0.015915963711602737 - cluster/prob_snapshot/cluster_58:0.021144357790864234 - cluster/prob_snapshot/cluster_59:0.015915963711602737 - cluster/prob_snapshot/cluster_60:0.013528569154862327 - cluster/prob_snapshot/cluster_61:0.015915963711602737 - cluster/prob_snapshot/cluster_62:0.01830335826834315 - cluster/prob_snapshot/cluster_63:0.013528569154862327
[36m(TaskRunner pid=2823680)[0m Training Progress:   6%|▌         | 45/800 [1:18:00<21:07:23, 100.72s/it]
[36m(TaskRunner pid=2823680)[0m step:45 - global_seqlen/min:356679 - global_seqlen/max:432331 - global_seqlen/minmax_diff:75652 - global_seqlen/balanced_min:386258 - global_seqlen/balanced_max:386326 - global_seqlen/mean:386304.75 - frontier/skipped_zero_acc_count:54.0 - actor/entropy:np.float64(0.2772844089446841) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009336303919553757 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.039934451051522046) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00020260639226959225) - actor/ppo_kl:np.float64(-2.7406959757909873e-05) - actor/pg_clipfrac_lower:np.float64(2.44987549612651e-06) - actor/grad_norm:np.float64(0.21058411598205568) - perf/mfu/actor:np.float64(0.28180553279433473) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(114.13240814208984) - actor/lr:np.float64(1e-06) - training/global_step:45 - training/epoch:0 - critic/score/mean:0.5 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4924241006374359 - critic/rewards/max:1.0022574663162231 - critic/rewards/min:-0.05840279906988144 - critic/advantages/mean:-0.15064407885074615 - critic/advantages/max:2.47485089302063 - critic/advantages/min:-2.474846363067627 - critic/returns/mean:-0.15064407885074615 - critic/returns/max:2.47485089302063 - critic/returns/min:-2.474846363067627 - response_length/mean:1137.1553955078125 - response_length/max:8192.0 - response_length/min:209.0 - response_length/clip_ratio:0.0033783784601837397 - response_length_non_aborted/mean:1137.1553955078125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:209.0 - response_length_non_aborted/clip_ratio:0.0033783784601837397 - response/aborted_ratio:0.0 - prompt_length/mean:234.8783721923828 - prompt_length/max:360.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.591730147600174e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6669518826529384) - timing_s/agent_loop/generate_sequences/max:np.float64(28.39188228547573) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.095886538807463) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.39188228547573) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:220 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.95466147363186 - timing_s/reward:0.00012060534209012985 - timing_s/old_log_prob:8.248190981335938 - timing_s/ref:14.488858087919652 - timing_s/adv:0.0642647035419941 - timing_s/update_actor:15.99266769643873 - timing_s/update_weights:20.50260625127703 - timing_s/step:90.61743375752121 - timing_s/stop_profile:5.059689283370972e-05 - timing_per_token_ms/adv:7.911994861395602e-05 - timing_per_token_ms/update_actor:0.019689487021681575 - timing_per_token_ms/gen:0.04598164795042136 - timing_per_token_ms/ref:0.017838061084993737 - perf/total_num_tokens:1545219 - perf/time_per_step:90.61743375752121 - perf/throughput:4263.029021917506 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:284.0 - frontier/mean_score:1.9490140624999999 - frontier/mean_frontier_pct:0.10663620551693065 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:2.51 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.7 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.3 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.0569999999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.3 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.637 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:1.763 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.0569999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:2.3629999999999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.6569999999999996 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:1.91 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:1.49 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.09 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.91 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.6569999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.51 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:1.7 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.3 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:1.49 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.3 - frontier/cluster_35/cluster_size:184.0 - 
frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:1.7 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:1.763 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.6569999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:1.7 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.51 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.7 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:1.7 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:1.343 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - 
frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.3 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:1.343 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:32.0 - frontier/cluster_58/score:2.7598999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.7 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:45.0 - cluster/prob_snapshot/cluster_0:0.02012235352970933 - cluster/prob_snapshot/cluster_1:0.016033747832437717 - cluster/prob_snapshot/cluster_2:0.013628685657572059 - cluster/prob_snapshot/cluster_3:0.018438810007303373 - cluster/prob_snapshot/cluster_4:0.016033747832437717 - cluster/prob_snapshot/cluster_5:0.018438810007303373 - cluster/prob_snapshot/cluster_6:0.013628685657572059 - cluster/prob_snapshot/cluster_7:0.018438810007303373 - cluster/prob_snapshot/cluster_8:0.016490709645662186 - cluster/prob_snapshot/cluster_9:0.011945142135166098 - cluster/prob_snapshot/cluster_10:0.018438810007303373 - cluster/prob_snapshot/cluster_11:0.01312362260085027 - cluster/prob_snapshot/cluster_12:0.014133748714293846 - cluster/prob_snapshot/cluster_13:0.016490709645662186 - cluster/prob_snapshot/cluster_14:0.018943873064025157 - 
cluster/prob_snapshot/cluster_15:0.021300833995393502 - cluster/prob_snapshot/cluster_16:0.015312229179978019 - cluster/prob_snapshot/cluster_17:0.011945142135166098 - cluster/prob_snapshot/cluster_18:0.015312229179978019 - cluster/prob_snapshot/cluster_19:0.016033747832437717 - cluster/prob_snapshot/cluster_20:0.01675526648489741 - cluster/prob_snapshot/cluster_21:0.015312229179978019 - cluster/prob_snapshot/cluster_22:0.013628685657572059 - cluster/prob_snapshot/cluster_23:0.016033747832437717 - cluster/prob_snapshot/cluster_24:0.021300833995393502 - cluster/prob_snapshot/cluster_25:0.013628685657572059 - cluster/prob_snapshot/cluster_26:0.013628685657572059 - cluster/prob_snapshot/cluster_27:0.02012235352970933 - cluster/prob_snapshot/cluster_28:0.013628685657572059 - cluster/prob_snapshot/cluster_29:0.011945142135166098 - cluster/prob_snapshot/cluster_30:0.018438810007303373 - cluster/prob_snapshot/cluster_31:0.016033747832437717 - cluster/prob_snapshot/cluster_32:0.011945142135166098 - cluster/prob_snapshot/cluster_33:0.013628685657572059 - cluster/prob_snapshot/cluster_34:0.011945142135166098 - cluster/prob_snapshot/cluster_35:0.018438810007303373 - cluster/prob_snapshot/cluster_36:0.018438810007303373 - cluster/prob_snapshot/cluster_37:0.013628685657572059 - cluster/prob_snapshot/cluster_38:0.011945142135166098 - cluster/prob_snapshot/cluster_39:0.014133748714293846 - cluster/prob_snapshot/cluster_40:0.016033747832437717 - cluster/prob_snapshot/cluster_41:0.021300833995393502 - cluster/prob_snapshot/cluster_42:0.013628685657572059 - cluster/prob_snapshot/cluster_43:0.02012235352970933 - cluster/prob_snapshot/cluster_44:0.011945142135166098 - cluster/prob_snapshot/cluster_45:0.013628685657572059 - cluster/prob_snapshot/cluster_46:0.018438810007303373 - cluster/prob_snapshot/cluster_47:0.02012235352970933 - cluster/prob_snapshot/cluster_48:0.013628685657572059 - cluster/prob_snapshot/cluster_49:0.013628685657572059 - 
cluster/prob_snapshot/cluster_50:0.013628685657572059 - cluster/prob_snapshot/cluster_51:0.010766661669481927 - cluster/prob_snapshot/cluster_52:0.011945142135166098 - cluster/prob_snapshot/cluster_53:0.013628685657572059 - cluster/prob_snapshot/cluster_54:0.013628685657572059 - cluster/prob_snapshot/cluster_55:0.018438810007303373 - cluster/prob_snapshot/cluster_56:0.010766661669481927 - cluster/prob_snapshot/cluster_57:0.016033747832437717 - cluster/prob_snapshot/cluster_58:0.02212577032137242 - cluster/prob_snapshot/cluster_59:0.016033747832437717 - cluster/prob_snapshot/cluster_60:0.013628685657572059 - cluster/prob_snapshot/cluster_61:0.016033747832437717 - cluster/prob_snapshot/cluster_62:0.018438810007303373 - cluster/prob_snapshot/cluster_63:0.013628685657572059
[36m(RewardLoopWorker pid=2826762)[0m WARNING:2026-04-12 12:50:27,987:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:   6%|▌         | 46/800 [1:20:04<22:32:06, 107.60s/it]
[36m(TaskRunner pid=2823680)[0m step:46 - global_seqlen/min:371527 - global_seqlen/max:456968 - global_seqlen/minmax_diff:85441 - global_seqlen/balanced_min:411983 - global_seqlen/balanced_max:412076 - global_seqlen/mean:412034.0 - frontier/skipped_zero_acc_count:43.0 - actor/entropy:np.float64(0.2865053029774233) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008831365965306759 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04188407334731892) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00022493468409371672) - actor/ppo_kl:np.float64(-1.9777896763801354e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.21997244385155765) - perf/mfu/actor:np.float64(0.2456705787720681) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(107.36945724487305) - actor/lr:np.float64(1e-06) - training/global_step:46 - training/epoch:0 - critic/score/mean:0.47058823704719543 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.46245077252388 - critic/rewards/max:1.0056318044662476 - critic/rewards/min:-0.04445118457078934 - critic/advantages/mean:-0.13268227875232697 - critic/advantages/max:2.474825620651245 - critic/advantages/min:-2.4748332500457764 - critic/returns/mean:-0.13268227875232697 - critic/returns/max:2.474825620651245 - critic/returns/min:-2.4748332500457764 - response_length/mean:1259.1500244140625 - response_length/max:8192.0 - response_length/min:160.0 - response_length/clip_ratio:0.0117647061124444 - response_length_non_aborted/mean:1259.1500244140625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:160.0 - response_length_non_aborted/clip_ratio:0.0117647061124444 - response/aborted_ratio:0.0 - prompt_length/mean:241.8235321044922 - prompt_length/max:497.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.01147723197937e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3007347090169787) - timing_s/agent_loop/generate_sequences/max:np.float64(29.051088478416204) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.566162092830382) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.051088478416204) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:255 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.498517300002277 - timing_s/reward:0.0001023169606924057 - timing_s/old_log_prob:9.266192737966776 - timing_s/ref:26.222737718373537 - timing_s/adv:0.0657326839864254 - timing_s/update_actor:19.720556383021176 - timing_s/update_weights:37.18783219624311 - timing_s/step:123.34679863695055 - timing_s/stop_profile:5.5701471865177155e-05 - timing_per_token_ms/adv:6.440200966277318e-05 - timing_per_token_ms/update_actor:0.019321338879101188 - timing_per_token_ms/gen:0.03561987113155499 - timing_per_token_ms/ref:0.025691891848989713 - perf/total_num_tokens:1648136 - perf/time_per_step:123.34679863695055 - perf/throughput:3340.451511941944 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:327.0 - frontier/mean_score:1.9498859374999997 - frontier/mean_frontier_pct:0.12178766070927907 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:8.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:2.51 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:1.7 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.7 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.3 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:1.7398999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.3 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.0458999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:1.763 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.0569999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:1.9540999999999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.6569999999999996 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:1.91 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:1.343 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.09 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.91 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:1.7 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:2.7598999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.51 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:2.09 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:2.51 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:1.7 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:1.49 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:2.51 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.09 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:1.763 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.6569999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.51 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.6569999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.7 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:1.7 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:1.343 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - 
frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.09 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.3 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:1.343 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:32.0 - frontier/cluster_58/score:2.7598999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:32.0 - frontier/cluster_63/score:1.49 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:46.0 - cluster/prob_snapshot/cluster_0:0.02011335598957311 - cluster/prob_snapshot/cluster_1:0.013622591706085374 - cluster/prob_snapshot/cluster_2:0.013622591706085374 - cluster/prob_snapshot/cluster_3:0.01843056524940962 - cluster/prob_snapshot/cluster_4:0.016026578477747497 - cluster/prob_snapshot/cluster_5:0.01843056524940962 - cluster/prob_snapshot/cluster_6:0.013622591706085374 - cluster/prob_snapshot/cluster_7:0.01843056524940962 - cluster/prob_snapshot/cluster_8:0.013942321946716433 - cluster/prob_snapshot/cluster_9:0.011939800965921886 - cluster/prob_snapshot/cluster_10:0.01843056524940962 - cluster/prob_snapshot/cluster_11:0.0163943884538118 - cluster/prob_snapshot/cluster_12:0.01412742892813442 - cluster/prob_snapshot/cluster_13:0.0164833359643633 - cluster/prob_snapshot/cluster_14:0.01565876850168319 - 
cluster/prob_snapshot/cluster_15:0.02129130950768755 - cluster/prob_snapshot/cluster_16:0.01530538244624886 - cluster/prob_snapshot/cluster_17:0.010761847447807445 - cluster/prob_snapshot/cluster_18:0.01530538244624886 - cluster/prob_snapshot/cluster_19:0.016026578477747497 - cluster/prob_snapshot/cluster_20:0.016747774509246135 - cluster/prob_snapshot/cluster_21:0.01530538244624886 - cluster/prob_snapshot/cluster_22:0.013622591706085374 - cluster/prob_snapshot/cluster_23:0.013622591706085374 - cluster/prob_snapshot/cluster_24:0.022115876970367657 - cluster/prob_snapshot/cluster_25:0.013622591706085374 - cluster/prob_snapshot/cluster_26:0.013622591706085374 - cluster/prob_snapshot/cluster_27:0.02011335598957311 - cluster/prob_snapshot/cluster_28:0.016747774509246135 - cluster/prob_snapshot/cluster_29:0.011939800965921886 - cluster/prob_snapshot/cluster_30:0.02011335598957311 - cluster/prob_snapshot/cluster_31:0.013622591706085374 - cluster/prob_snapshot/cluster_32:0.011939800965921886 - cluster/prob_snapshot/cluster_33:0.013622591706085374 - cluster/prob_snapshot/cluster_34:0.011939800965921886 - cluster/prob_snapshot/cluster_35:0.02011335598957311 - cluster/prob_snapshot/cluster_36:0.01843056524940962 - cluster/prob_snapshot/cluster_37:0.016747774509246135 - cluster/prob_snapshot/cluster_38:0.011939800965921886 - cluster/prob_snapshot/cluster_39:0.01412742892813442 - cluster/prob_snapshot/cluster_40:0.016026578477747497 - cluster/prob_snapshot/cluster_41:0.02129130950768755 - cluster/prob_snapshot/cluster_42:0.011939800965921886 - cluster/prob_snapshot/cluster_43:0.02011335598957311 - cluster/prob_snapshot/cluster_44:0.011939800965921886 - cluster/prob_snapshot/cluster_45:0.013622591706085374 - cluster/prob_snapshot/cluster_46:0.01843056524940962 - cluster/prob_snapshot/cluster_47:0.02129130950768755 - cluster/prob_snapshot/cluster_48:0.013622591706085374 - cluster/prob_snapshot/cluster_49:0.013622591706085374 - 
cluster/prob_snapshot/cluster_50:0.013622591706085374 - cluster/prob_snapshot/cluster_51:0.010761847447807445 - cluster/prob_snapshot/cluster_52:0.011939800965921886 - cluster/prob_snapshot/cluster_53:0.013622591706085374 - cluster/prob_snapshot/cluster_54:0.016747774509246135 - cluster/prob_snapshot/cluster_55:0.01843056524940962 - cluster/prob_snapshot/cluster_56:0.010761847447807445 - cluster/prob_snapshot/cluster_57:0.016026578477747497 - cluster/prob_snapshot/cluster_58:0.022115876970367657 - cluster/prob_snapshot/cluster_59:0.016026578477747497 - cluster/prob_snapshot/cluster_60:0.013622591706085374 - cluster/prob_snapshot/cluster_61:0.016026578477747497 - cluster/prob_snapshot/cluster_62:0.01843056524940962 - cluster/prob_snapshot/cluster_63:0.011939800965921886
(RewardLoopWorker pid=2826756) WARNING:2026-04-12 12:52:28,100:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:   6%|▌         | 47/800 [1:21:46<22:08:31, 105.86s/it]
(TaskRunner pid=2823680) step:47 - global_seqlen/min:275265 - global_seqlen/max:465028 - global_seqlen/minmax_diff:189763 - global_seqlen/balanced_min:391718 - global_seqlen/balanced_max:391825 - global_seqlen/mean:391774.75 - frontier/skipped_zero_acc_count:46.0 - actor/entropy:np.float64(0.30421459047896104) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.00947895273566246 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.03289291872715694) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00021514540972797166) - actor/ppo_kl:np.float64(-2.8478847365686906e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2244584167545492) - perf/mfu/actor:np.float64(0.2738534107968139) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(113.51446151733398) - actor/lr:np.float64(1e-06) - training/global_step:47 - training/epoch:0 - critic/score/mean:0.5274389982223511 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5199055075645447 - critic/rewards/max:1.0014142990112305 - critic/rewards/min:-0.07204250246286392 - critic/advantages/mean:-0.16727280616760254 - critic/advantages/max:2.474858045578003 - critic/advantages/min:-2.474855899810791 - critic/returns/mean:-0.16727280616760254 - critic/returns/max:2.474858045578003 - critic/returns/min:-2.474855899810791 - response_length/mean:1079.4710693359375 - response_length/max:8192.0 - response_length/min:205.0 - response_length/clip_ratio:0.007621951401233673 - response_length_non_aborted/mean:1079.4710693359375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:205.0 - response_length_non_aborted/clip_ratio:0.007621951401233673 - response/aborted_ratio:0.0 - prompt_length/mean:239.47561645507812 - prompt_length/max:478.0 - prompt_length/min:181.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.043514728546143e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6339116310700774) - timing_s/agent_loop/generate_sequences/max:np.float64(28.851775844581425) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.936535325027762) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.851775844581425) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:212 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.7400444149971 - timing_s/reward:0.0001157764345407486 - timing_s/old_log_prob:8.747145904228091 - timing_s/ref:18.642109361477196 - timing_s/adv:0.06703916937112808 - timing_s/update_actor:16.893194925040007 - timing_s/update_weights:25.115465423092246 - timing_s/step:101.58404454216361 - timing_s/stop_profile:6.840657442808151e-05 - timing_per_token_ms/adv:7.748141748731039e-05 - timing_per_token_ms/update_actor:0.019524536192198835 - timing_per_token_ms/gen:0.044822151227237116 - timing_per_token_ms/ref:0.021545867465696592 - perf/total_num_tokens:1567099 - perf/time_per_step:101.58404454216361 - perf/throughput:3856.6563456467757 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:373.0 - frontier/mean_score:1.9628899999999998 - frontier/mean_frontier_pct:0.14045364913653235 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:2.51 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:1.7 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.7 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.3 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:1.7398999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.343 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.3 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.0458999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:1.763 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.0569999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:1.9540999999999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.6569999999999996 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:1.91 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.09 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.237 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:1.7 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:2.8319299999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.0569999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.3629999999999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:2.6569999999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:1.7 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:1.49 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - 
frontier/cluster_35/score:2.51 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.09 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.49 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:1.763 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.6569999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.6569999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:1.49 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:1.91 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.6569999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:1.49 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:1.7 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:1.343 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.09 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:2.51 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:1.8400999999999998 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:32.0 - frontier/cluster_58/score:2.8319299999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:1.7 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:32.0 - frontier/cluster_63/score:1.49 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:47.0 - cluster/prob_snapshot/cluster_0:0.019980105864312314 - cluster/prob_snapshot/cluster_1:0.013532342617263321 - cluster/prob_snapshot/cluster_2:0.013532342617263321 - cluster/prob_snapshot/cluster_3:0.018308463541003316 - cluster/prob_snapshot/cluster_4:0.01592040307913332 - cluster/prob_snapshot/cluster_5:0.018308463541003316 - cluster/prob_snapshot/cluster_6:0.013532342617263321 - cluster/prob_snapshot/cluster_7:0.018308463541003316 - cluster/prob_snapshot/cluster_8:0.013849954658692029 - cluster/prob_snapshot/cluster_9:0.010690550667638025 - cluster/prob_snapshot/cluster_10:0.018308463541003316 - cluster/prob_snapshot/cluster_11:0.016285776329799426 - cluster/prob_snapshot/cluster_12:0.01403383531425602 - 
cluster/prob_snapshot/cluster_13:0.016374134566888614 - cluster/prob_snapshot/cluster_14:0.015555029828467208 - cluster/prob_snapshot/cluster_15:0.021150255490628614 - cluster/prob_snapshot/cluster_16:0.01520398494057232 - cluster/prob_snapshot/cluster_17:0.009871445929216615 - cluster/prob_snapshot/cluster_18:0.01520398494057232 - cluster/prob_snapshot/cluster_19:0.01592040307913332 - cluster/prob_snapshot/cluster_20:0.016636821217694317 - cluster/prob_snapshot/cluster_21:0.01780697084401062 - cluster/prob_snapshot/cluster_22:0.013532342617263321 - cluster/prob_snapshot/cluster_23:0.013532342617263321 - cluster/prob_snapshot/cluster_24:0.022542733545945007 - cluster/prob_snapshot/cluster_25:0.013532342617263321 - cluster/prob_snapshot/cluster_26:0.013532342617263321 - cluster/prob_snapshot/cluster_27:0.016374134566888614 - cluster/prob_snapshot/cluster_28:0.018809956237996014 - cluster/prob_snapshot/cluster_29:0.011860700293954323 - cluster/prob_snapshot/cluster_30:0.021150255490628614 - cluster/prob_snapshot/cluster_31:0.013532342617263321 - cluster/prob_snapshot/cluster_32:0.011860700293954323 - cluster/prob_snapshot/cluster_33:0.013532342617263321 - cluster/prob_snapshot/cluster_34:0.011860700293954323 - cluster/prob_snapshot/cluster_35:0.019980105864312314 - cluster/prob_snapshot/cluster_36:0.018308463541003316 - cluster/prob_snapshot/cluster_37:0.016636821217694317 - cluster/prob_snapshot/cluster_38:0.011860700293954323 - cluster/prob_snapshot/cluster_39:0.01403383531425602 - cluster/prob_snapshot/cluster_40:0.023084584464743315 - cluster/prob_snapshot/cluster_41:0.021150255490628614 - cluster/prob_snapshot/cluster_42:0.011860700293954323 - cluster/prob_snapshot/cluster_43:0.021150255490628614 - cluster/prob_snapshot/cluster_44:0.011860700293954323 - cluster/prob_snapshot/cluster_45:0.011860700293954323 - cluster/prob_snapshot/cluster_46:0.01520398494057232 - cluster/prob_snapshot/cluster_47:0.021150255490628614 - 
cluster/prob_snapshot/cluster_48:0.011860700293954323 - cluster/prob_snapshot/cluster_49:0.013532342617263321 - cluster/prob_snapshot/cluster_50:0.013532342617263321 - cluster/prob_snapshot/cluster_51:0.010690550667638025 - cluster/prob_snapshot/cluster_52:0.011860700293954323 - cluster/prob_snapshot/cluster_53:0.013532342617263321 - cluster/prob_snapshot/cluster_54:0.016636821217694317 - cluster/prob_snapshot/cluster_55:0.019980105864312314 - cluster/prob_snapshot/cluster_56:0.01464756685295661 - cluster/prob_snapshot/cluster_57:0.01592040307913332 - cluster/prob_snapshot/cluster_58:0.022542733545945007 - cluster/prob_snapshot/cluster_59:0.01592040307913332 - cluster/prob_snapshot/cluster_60:0.013532342617263321 - cluster/prob_snapshot/cluster_61:0.013532342617263321 - cluster/prob_snapshot/cluster_62:0.018308463541003316 - cluster/prob_snapshot/cluster_63:0.011860700293954323
(TaskRunner pid=2823680) Training Progress:   6%|▌         | 48/800 [1:23:33<22:11:54, 106.27s/it]
(TaskRunner pid=2823680) step:48 - global_seqlen/min:349037 - global_seqlen/max:422257 - global_seqlen/minmax_diff:73220 - global_seqlen/balanced_min:392423 - global_seqlen/balanced_max:392523 - global_seqlen/mean:392474.25 - frontier/skipped_zero_acc_count:44.0 - actor/entropy:np.float64(0.3012257047174942) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01007020566612482 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.015586392197292298) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00020250235394043904) - actor/ppo_kl:np.float64(5.7856060548390686e-05) - actor/pg_clipfrac_lower:np.float64(1.991428949362931e-07) - actor/grad_norm:np.float64(0.22006447071378882) - perf/mfu/actor:np.float64(0.25246082397930647) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(113.63390731811523) - actor/lr:np.float64(1e-06) - training/global_step:48 - training/epoch:0 - critic/score/mean:0.494047611951828 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.48581236600875854 - critic/rewards/max:1.0035959482192993 - critic/rewards/min:-0.03898397088050842 - critic/advantages/mean:-0.16489708423614502 - critic/advantages/max:2.474844217300415 - critic/advantages/min:-2.4748587608337402 - critic/returns/mean:-0.16489708423614502 - critic/returns/max:2.474844217300415 - critic/returns/min:-2.4748587608337402 - response_length/mean:1231.174072265625 - response_length/max:8192.0 - response_length/min:194.0 - response_length/clip_ratio:0.014880952425301075 - response_length_non_aborted/mean:1231.174072265625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:194.0 - response_length_non_aborted/clip_ratio:0.014880952425301075 - response/aborted_ratio:0.0 - prompt_length/mean:232.42857360839844 - prompt_length/max:417.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.265169501304626e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.56125015206635) - timing_s/agent_loop/generate_sequences/max:np.float64(28.80530594382435) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.902339869207026) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.80530594382435) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:277 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.630011063069105 - timing_s/reward:0.00013588368892669678 - timing_s/old_log_prob:9.840499861165881 - timing_s/ref:20.957236437126994 - timing_s/adv:0.06787768565118313 - timing_s/update_actor:18.3022756325081 - timing_s/update_weights:26.832501837983727 - timing_s/step:107.0164899956435 - timing_s/stop_profile:5.501694977283478e-05 - timing_per_token_ms/adv:6.901358016715432e-05 - timing_per_token_ms/update_actor:0.01860855381982866 - timing_per_token_ms/gen:0.037021874762729036 - timing_per_token_ms/ref:0.021307943885539082 - perf/total_num_tokens:1569897 - perf/time_per_step:107.0164899956435 - perf/throughput:3667.418451268371 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:417.0 - frontier/mean_score:1.9758676190476192 - frontier/mean_frontier_pct:0.155464969154014 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:8.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:2.51 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:1.7 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.7 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.51 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:1.7398999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.343 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.3 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.0458999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:1.763 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.0569999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:1.9540999999999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:2.7598999999999996 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:1.91 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - 
frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.09 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:2.4659 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:1.7 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:2.8319299999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:1.49 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.0569999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.3629999999999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:2.6569999999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:1.7 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:1.49 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:2.51 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:1.91 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.09 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:1.9429999999999998 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:1.763 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.53 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.6569999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.6569999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:1.49 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.637 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:1.49 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:1.7 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:1.343 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - 
frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.09 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.0569999999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:1.8400999999999998 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:32.0 - frontier/cluster_58/score:2.8319299999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:1.7 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:48.0 - cluster/prob_snapshot/cluster_0:0.02016393682309222 - cluster/prob_snapshot/cluster_1:0.013656849641138157 - cluster/prob_snapshot/cluster_2:0.011969827038409327 - cluster/prob_snapshot/cluster_3:0.018476914220363388 - cluster/prob_snapshot/cluster_4:0.016066881930750773 - cluster/prob_snapshot/cluster_5:0.018476914220363388 - cluster/prob_snapshot/cluster_6:0.013656849641138157 - cluster/prob_snapshot/cluster_7:0.02016393682309222 - cluster/prob_snapshot/cluster_8:0.013977383935656632 - cluster/prob_snapshot/cluster_9:0.010788911216499145 - cluster/prob_snapshot/cluster_10:0.018476914220363388 - cluster/prob_snapshot/cluster_11:0.0164356168710615 - 
cluster/prob_snapshot/cluster_12:0.014162956421956805 - cluster/prob_snapshot/cluster_13:0.016524788065777167 - cluster/prob_snapshot/cluster_14:0.015698146990440042 - cluster/prob_snapshot/cluster_15:0.022171493720339528 - cluster/prob_snapshot/cluster_16:0.015343872243866989 - cluster/prob_snapshot/cluster_17:0.009962270141162017 - cluster/prob_snapshot/cluster_18:0.015343872243866989 - cluster/prob_snapshot/cluster_19:0.016066881930750773 - cluster/prob_snapshot/cluster_20:0.016789891617634556 - cluster/prob_snapshot/cluster_21:0.019809662076519165 - cluster/prob_snapshot/cluster_22:0.013656849641138157 - cluster/prob_snapshot/cluster_23:0.013656849641138157 - cluster/prob_snapshot/cluster_24:0.022750142473075514 - cluster/prob_snapshot/cluster_25:0.013656849641138157 - cluster/prob_snapshot/cluster_26:0.011969827038409327 - cluster/prob_snapshot/cluster_27:0.016524788065777167 - cluster/prob_snapshot/cluster_28:0.018983021001182036 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0213448526450024 - cluster/prob_snapshot/cluster_31:0.013656849641138157 - cluster/prob_snapshot/cluster_32:0.011969827038409327 - cluster/prob_snapshot/cluster_33:0.013656849641138157 - cluster/prob_snapshot/cluster_34:0.011969827038409327 - cluster/prob_snapshot/cluster_35:0.02016393682309222 - cluster/prob_snapshot/cluster_36:0.015343872243866989 - cluster/prob_snapshot/cluster_37:0.016789891617634556 - cluster/prob_snapshot/cluster_38:0.015608975795724376 - cluster/prob_snapshot/cluster_39:0.014162956421956805 - cluster/prob_snapshot/cluster_40:0.028358046607775113 - cluster/prob_snapshot/cluster_41:0.0213448526450024 - cluster/prob_snapshot/cluster_42:0.011969827038409327 - cluster/prob_snapshot/cluster_43:0.0213448526450024 - cluster/prob_snapshot/cluster_44:0.011969827038409327 - cluster/prob_snapshot/cluster_45:0.011969827038409327 - cluster/prob_snapshot/cluster_46:0.013150742860319508 - cluster/prob_snapshot/cluster_47:0.022171493720339528 - 
cluster/prob_snapshot/cluster_48:0.011969827038409327 - cluster/prob_snapshot/cluster_49:0.013656849641138157 - cluster/prob_snapshot/cluster_50:0.011969827038409327 - cluster/prob_snapshot/cluster_51:0.010788911216499145 - cluster/prob_snapshot/cluster_52:0.011969827038409327 - cluster/prob_snapshot/cluster_53:0.013656849641138157 - cluster/prob_snapshot/cluster_54:0.016789891617634556 - cluster/prob_snapshot/cluster_55:0.016524788065777167 - cluster/prob_snapshot/cluster_56:0.014782334720387248 - cluster/prob_snapshot/cluster_57:0.018476914220363388 - cluster/prob_snapshot/cluster_58:0.022750142473075514 - cluster/prob_snapshot/cluster_59:0.016066881930750773 - cluster/prob_snapshot/cluster_60:0.013656849641138157 - cluster/prob_snapshot/cluster_61:0.013656849641138157 - cluster/prob_snapshot/cluster_62:0.02016393682309222 - cluster/prob_snapshot/cluster_63:0.010788911216499145
(RewardLoopWorker pid=2826758) WARNING:2026-04-12 12:55:59,706:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:   6%|▌         | 49/800 [1:25:06<21:18:55, 102.18s/it]
[36m(TaskRunner pid=2823680)[0m step:49 - global_seqlen/min:365741 - global_seqlen/max:478058 - global_seqlen/minmax_diff:112317 - global_seqlen/balanced_min:398199 - global_seqlen/balanced_max:398319 - global_seqlen/mean:398241.75 - frontier/skipped_zero_acc_count:56.0 - actor/entropy:np.float64(0.33236409775498843) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008584569208323956 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.02585036186792422) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002225897131009131) - actor/ppo_kl:np.float64(-3.622734505310316e-05) - actor/pg_clipfrac_lower:np.float64(2.37254688626207e-06) - actor/grad_norm:np.float64(0.2398300204012129) - perf/mfu/actor:np.float64(0.2649568463818475) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(113.7652359008789) - actor/lr:np.float64(1e-06) - training/global_step:49 - training/epoch:0 - critic/score/mean:0.4461805522441864 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.43854957818984985 - critic/rewards/max:1.0031431913375854 - critic/rewards/min:-0.06352488696575165 - critic/advantages/mean:-0.14405901730060577 - critic/advantages/max:2.4748566150665283 - critic/advantages/min:-2.4748470783233643 - critic/returns/mean:-0.14405901730060577 - critic/returns/max:2.4748566150665283 - critic/returns/min:-2.4748470783233643 - response_length/mean:1217.82470703125 - response_length/max:8192.0 - response_length/min:278.0 - response_length/clip_ratio:0.0086805559694767 - response_length_non_aborted/mean:1217.82470703125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:278.0 - response_length_non_aborted/clip_ratio:0.0086805559694767 - response/aborted_ratio:0.0 - prompt_length/mean:245.19444274902344 - prompt_length/max:381.0 - prompt_length/min:178.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.71000811457634e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.8547731535509229) - timing_s/agent_loop/generate_sequences/max:np.float64(28.79870241228491) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.173812553377502) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.79870241228491) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:200 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.620614823885262 - timing_s/reward:0.00017128139734268188 - timing_s/old_log_prob:8.686519850976765 - timing_s/ref:13.18013992998749 - timing_s/adv:0.052080157212913036 - timing_s/update_actor:17.577438530512154 - timing_s/update_weights:21.861604617908597 - timing_s/step:92.42738828808069 - timing_s/stop_profile:6.468873471021652e-05 - timing_per_token_ms/adv:6.180161269078644e-05 - timing_per_token_ms/update_actor:0.02085850170762295 - timing_per_token_ms/gen:0.04365225281286969 - timing_per_token_ms/ref:0.01564038871529157 - perf/total_num_tokens:1592967 - perf/time_per_step:92.42738828808069 - perf/throughput:4308.698507835656 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:473.0 - frontier/mean_score:1.953797 - frontier/mean_frontier_pct:0.17134317602505603 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:6.0 - frontier/batch_hard_count:10.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:2.6569999999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:1.49 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.6569999999999996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:1.7398999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.343 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.3 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.0458999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:1.763 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.0569999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:1.9540999999999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:2.2319299999999993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.237 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - 
frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.09 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:2.4659 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:1.7 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:2.8319299999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.343 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.0569999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.3629999999999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:2.6569999999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:1.7 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.9429999999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:1.49 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:2.51 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:1.91 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.09 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:1.6601 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:1.763 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.53 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.7598999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.6569999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:1.343 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.637 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:1.49 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:1.7 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:1.2401 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - 
frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:1.763 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.0569999999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:1.8400999999999998 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.2823509999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:1.7 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.6569999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:49.0 - cluster/prob_snapshot/cluster_0:0.02158596987025938 - cluster/prob_snapshot/cluster_1:0.012105041440228257 - cluster/prob_snapshot/cluster_2:0.012105041440228257 - cluster/prob_snapshot/cluster_3:0.018685634437936235 - cluster/prob_snapshot/cluster_4:0.016248377772118468 - cluster/prob_snapshot/cluster_5:0.018685634437936235 - cluster/prob_snapshot/cluster_6:0.012105041440228257 - cluster/prob_snapshot/cluster_7:0.02158596987025938 - cluster/prob_snapshot/cluster_8:0.014135276242854457 - cluster/prob_snapshot/cluster_9:0.01091078567397755 - cluster/prob_snapshot/cluster_10:0.018685634437936235 - 
cluster/prob_snapshot/cluster_11:0.016621278041988582 - cluster/prob_snapshot/cluster_12:0.014322945006122427 - cluster/prob_snapshot/cluster_13:0.01671145653862384 - cluster/prob_snapshot/cluster_14:0.015875477502248346 - cluster/prob_snapshot/cluster_15:0.01813262090046218 - cluster/prob_snapshot/cluster_16:0.018173810538114506 - cluster/prob_snapshot/cluster_17:0.010074806637602055 - cluster/prob_snapshot/cluster_18:0.015517200772373136 - cluster/prob_snapshot/cluster_19:0.016248377772118468 - cluster/prob_snapshot/cluster_20:0.016979554771863796 - cluster/prob_snapshot/cluster_21:0.020033437374133465 - cluster/prob_snapshot/cluster_22:0.013811121106300697 - cluster/prob_snapshot/cluster_23:0.013811121106300697 - cluster/prob_snapshot/cluster_24:0.02300713423209772 - cluster/prob_snapshot/cluster_25:0.012105041440228257 - cluster/prob_snapshot/cluster_26:0.01091078567397755 - cluster/prob_snapshot/cluster_27:0.01671145653862384 - cluster/prob_snapshot/cluster_28:0.019197458337757965 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02158596987025938 - cluster/prob_snapshot/cluster_31:0.013811121106300697 - cluster/prob_snapshot/cluster_32:0.01578529900561309 - cluster/prob_snapshot/cluster_33:0.013811121106300697 - cluster/prob_snapshot/cluster_34:0.012105041440228257 - cluster/prob_snapshot/cluster_35:0.020391714104008675 - cluster/prob_snapshot/cluster_36:0.015517200772373136 - cluster/prob_snapshot/cluster_37:0.016979554771863796 - cluster/prob_snapshot/cluster_38:0.013486965969746932 - cluster/prob_snapshot/cluster_39:0.014322945006122427 - cluster/prob_snapshot/cluster_40:0.028678386767789093 - cluster/prob_snapshot/cluster_41:0.022421948906634875 - cluster/prob_snapshot/cluster_42:0.012105041440228257 - cluster/prob_snapshot/cluster_43:0.02158596987025938 - cluster/prob_snapshot/cluster_44:0.012105041440228257 - cluster/prob_snapshot/cluster_45:0.01091078567397755 - cluster/prob_snapshot/cluster_46:0.013299297206478966 - 
cluster/prob_snapshot/cluster_47:0.022421948906634875 - cluster/prob_snapshot/cluster_48:0.012105041440228257 - cluster/prob_snapshot/cluster_49:0.013811121106300697 - cluster/prob_snapshot/cluster_50:0.012105041440228257 - cluster/prob_snapshot/cluster_51:0.010074806637602055 - cluster/prob_snapshot/cluster_52:0.012105041440228257 - cluster/prob_snapshot/cluster_53:0.013811121106300697 - cluster/prob_snapshot/cluster_54:0.014322945006122427 - cluster/prob_snapshot/cluster_55:0.01671145653862384 - cluster/prob_snapshot/cluster_56:0.014949319969237594 - cluster/prob_snapshot/cluster_57:0.018685634437936235 - cluster/prob_snapshot/cluster_58:0.018542250628286173 - cluster/prob_snapshot/cluster_59:0.016248377772118468 - cluster/prob_snapshot/cluster_60:0.013811121106300697 - cluster/prob_snapshot/cluster_61:0.013811121106300697 - cluster/prob_snapshot/cluster_62:0.02158596987025938 - cluster/prob_snapshot/cluster_63:0.01091078567397755
[36m(TaskRunner pid=2823680)[0m local_global_step_folder: /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/checkpoints/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633/global_step_50
[36m(WorkerDict pid=2825160)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.
[36m(WorkerDict pid=2825160)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")
[36m(TaskRunner pid=2823680)[0m test_gen_batch meta info: {'eos_token_id': 151643, 'pad_token_id': 151643, 'recompute_log_prob': False, 'do_sample': True, 'validate': True, 'global_steps': 50}
[36m(TaskRunner pid=2823680)[0m validation generation end
[36m(TaskRunner pid=2823680)[0m Training Progress:   6%|▋         | 50/800 [1:30:05<33:38:23, 161.47s/it]
[36m(WorkerDict pid=2825159)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.[32m [repeated 3x across cluster][0m
[36m(WorkerDict pid=2825159)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")[32m [repeated 3x across cluster][0m
[36m(TaskRunner pid=2823680)[0m step:50 - global_seqlen/min:286430 - global_seqlen/max:430111 - global_seqlen/minmax_diff:143681 - global_seqlen/balanced_min:368247 - global_seqlen/balanced_max:368314 - global_seqlen/mean:368289.75 - frontier/skipped_zero_acc_count:50.0 - actor/entropy:np.float64(0.313802413833447) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010766689665615559 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.016793744420283474) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002448206341796322) - actor/ppo_kl:np.float64(-3.779626228186736e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.23567014187574387) - perf/mfu/actor:np.float64(0.25034764058009623) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(113.74157905578613) - actor/lr:np.float64(1e-06) - val-aux/aime2024/reward/mean@16:np.float64(0.0875) - val-aux/aime2024/reward/std@16:np.float64(0.11502559975500277) - val-aux/aime2024/reward/best@2/mean:np.float64(0.1358666666666667) - val-aux/aime2024/reward/best@2/std:np.float64(0.11883929600221009) - val-aux/aime2024/reward/worst@2/mean:np.float64(0.037566666666666665) - val-aux/aime2024/reward/worst@2/std:np.float64(0.0806593700892105) - val-aux/aime2024/reward/maj@2/mean:np.float64(0.0871) - val-aux/aime2024/reward/maj@2/std:np.float64(0.11503668482500415) - val-aux/aime2024/reward/best@4/mean:np.float64(0.1907) - val-aux/aime2024/reward/best@4/std:np.float64(0.09907160748357714) - val-aux/aime2024/reward/worst@4/mean:np.float64(0.009666666666666667) - val-aux/aime2024/reward/worst@4/std:np.float64(0.03731768157193141) - val-aux/aime2024/reward/maj@4/mean:np.float64(0.1103) - val-aux/aime2024/reward/maj@4/std:np.float64(0.11197342569728591) - val-aux/aime2024/reward/best@8/mean:np.float64(0.2320666666666667) - 
val-aux/aime2024/reward/best@8/std:np.float64(0.06234451490786012) - val-aux/aime2024/reward/worst@8/mean:np.float64(0.0009) - val-aux/aime2024/reward/worst@8/std:np.float64(0.007873343302638508) - val-aux/aime2024/reward/maj@8/mean:np.float64(0.1368) - val-aux/aime2024/reward/maj@8/std:np.float64(0.10238764826640452) - val-aux/aime2024/reward/best@16/mean:np.float64(0.2534666666666667) - val-aux/aime2024/reward/best@16/std:np.float64(0.02921428347274683) - val-aux/aime2024/reward/worst@16/mean:np.float64(3.3333333333333335e-05) - val-aux/aime2024/reward/worst@16/std:np.float64(0.001053565375285274) - val-aux/aime2024/reward/maj@16/mean:np.float64(0.16193333333333332) - val-aux/aime2024/reward/maj@16/std:np.float64(0.08192814761995802) - val-aux/aime2024/score/mean@16:np.float64(0.0875) - val-aux/aime2024/score/std@16:np.float64(0.11502559975500277) - val-aux/aime2024/score/best@2/mean:np.float64(0.1358666666666667) - val-aux/aime2024/score/best@2/std:np.float64(0.11883929600221009) - val-aux/aime2024/score/worst@2/mean:np.float64(0.037566666666666665) - val-aux/aime2024/score/worst@2/std:np.float64(0.0806593700892105) - val-aux/aime2024/score/maj@2/mean:np.float64(0.0871) - val-aux/aime2024/score/maj@2/std:np.float64(0.11503668482500415) - val-aux/aime2024/score/best@4/mean:np.float64(0.1907) - val-aux/aime2024/score/best@4/std:np.float64(0.09907160748357714) - val-aux/aime2024/score/worst@4/mean:np.float64(0.009666666666666667) - val-aux/aime2024/score/worst@4/std:np.float64(0.03731768157193141) - val-aux/aime2024/score/maj@4/mean:np.float64(0.1103) - val-aux/aime2024/score/maj@4/std:np.float64(0.11197342569728591) - val-aux/aime2024/score/best@8/mean:np.float64(0.2320666666666667) - val-aux/aime2024/score/best@8/std:np.float64(0.06234451490786012) - val-aux/aime2024/score/worst@8/mean:np.float64(0.0009) - val-aux/aime2024/score/worst@8/std:np.float64(0.007873343302638508) - val-aux/aime2024/score/maj@8/mean:np.float64(0.1368) - 
val-aux/aime2024/score/maj@8/std:np.float64(0.10238764826640452) - val-aux/aime2024/score/best@16/mean:np.float64(0.2534666666666667) - val-aux/aime2024/score/best@16/std:np.float64(0.02921428347274683) - val-aux/aime2024/score/worst@16/mean:np.float64(3.3333333333333335e-05) - val-aux/aime2024/score/worst@16/std:np.float64(0.001053565375285274) - val-aux/aime2024/score/maj@16/mean:np.float64(0.16193333333333332) - val-aux/aime2024/score/maj@16/std:np.float64(0.08192814761995802) - val-core/aime2024/acc/mean@16:np.float64(0.0875) - val-aux/aime2024/acc/std@16:np.float64(0.11502559975500277) - val-aux/aime2024/acc/best@2/mean:np.float64(0.1358666666666667) - val-aux/aime2024/acc/best@2/std:np.float64(0.11883929600221009) - val-aux/aime2024/acc/worst@2/mean:np.float64(0.037566666666666665) - val-aux/aime2024/acc/worst@2/std:np.float64(0.0806593700892105) - val-aux/aime2024/acc/maj@2/mean:np.float64(0.0871) - val-aux/aime2024/acc/maj@2/std:np.float64(0.11503668482500415) - val-aux/aime2024/acc/best@4/mean:np.float64(0.1907) - val-aux/aime2024/acc/best@4/std:np.float64(0.09907160748357714) - val-aux/aime2024/acc/worst@4/mean:np.float64(0.009666666666666667) - val-aux/aime2024/acc/worst@4/std:np.float64(0.03731768157193141) - val-aux/aime2024/acc/maj@4/mean:np.float64(0.1103) - val-aux/aime2024/acc/maj@4/std:np.float64(0.11197342569728591) - val-aux/aime2024/acc/best@8/mean:np.float64(0.2320666666666667) - val-aux/aime2024/acc/best@8/std:np.float64(0.06234451490786012) - val-aux/aime2024/acc/worst@8/mean:np.float64(0.0009) - val-aux/aime2024/acc/worst@8/std:np.float64(0.007873343302638508) - val-aux/aime2024/acc/maj@8/mean:np.float64(0.1368) - val-aux/aime2024/acc/maj@8/std:np.float64(0.10238764826640452) - val-core/aime2024/acc/best@16/mean:np.float64(0.2534666666666667) - val-core/aime2024/acc/best@16/std:np.float64(0.02921428347274683) - val-aux/aime2024/acc/worst@16/mean:np.float64(3.3333333333333335e-05) - 
val-aux/aime2024/acc/worst@16/std:np.float64(0.001053565375285274) - val-core/aime2024/acc/maj@16/mean:np.float64(0.16193333333333332) - val-core/aime2024/acc/maj@16/std:np.float64(0.08192814761995802) - val-aux/aime2025/reward/mean@16:np.float64(0.03125) - val-aux/aime2025/reward/std@16:np.float64(0.06806952844592577) - val-aux/aime2025/reward/best@2/mean:np.float64(0.05463333333333334) - val-aux/aime2025/reward/best@2/std:np.float64(0.08256691254498638) - val-aux/aime2025/reward/worst@2/mean:np.float64(0.0076) - val-aux/aime2025/reward/worst@2/std:np.float64(0.03260575969630515) - val-aux/aime2025/reward/maj@2/mean:np.float64(0.0309) - val-aux/aime2025/reward/maj@2/std:np.float64(0.06736238320501967) - val-aux/aime2025/reward/best@4/mean:np.float64(0.08866666666666667) - val-aux/aime2025/reward/best@4/std:np.float64(0.09004603544054997) - val-aux/aime2025/reward/worst@4/mean:np.float64(0.0005333333333333334) - val-aux/aime2025/reward/worst@4/std:np.float64(0.005733473202092254) - val-aux/aime2025/reward/maj@4/mean:np.float64(0.033733333333333324) - val-aux/aime2025/reward/maj@4/std:np.float64(0.066306466703177) - val-aux/aime2025/reward/best@8/mean:np.float64(0.13110000000000002) - val-aux/aime2025/reward/best@8/std:np.float64(0.0822673277824263) - val-aux/aime2025/reward/worst@8/mean:np.float64(0.0) - val-aux/aime2025/reward/worst@8/std:np.float64(0.0) - val-aux/aime2025/reward/maj@8/mean:np.float64(0.039299999999999995) - val-aux/aime2025/reward/maj@8/std:np.float64(0.06291732646825995) - val-aux/aime2025/reward/best@16/mean:np.float64(0.1672666666666667) - val-aux/aime2025/reward/best@16/std:np.float64(0.0595087442472739) - val-aux/aime2025/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2025/reward/worst@16/std:np.float64(0.0) - val-aux/aime2025/reward/maj@16/mean:np.float64(0.045233333333333334) - val-aux/aime2025/reward/maj@16/std:np.float64(0.05445292614629577) - val-aux/aime2025/score/mean@16:np.float64(0.03125) - 
val-aux/aime2025/score/std@16:np.float64(0.06806952844592577) - val-aux/aime2025/score/best@2/mean:np.float64(0.05463333333333334) - val-aux/aime2025/score/best@2/std:np.float64(0.08256691254498638) - val-aux/aime2025/score/worst@2/mean:np.float64(0.0076) - val-aux/aime2025/score/worst@2/std:np.float64(0.03260575969630515) - val-aux/aime2025/score/maj@2/mean:np.float64(0.0309) - val-aux/aime2025/score/maj@2/std:np.float64(0.06736238320501967) - val-aux/aime2025/score/best@4/mean:np.float64(0.08866666666666667) - val-aux/aime2025/score/best@4/std:np.float64(0.09004603544054997) - val-aux/aime2025/score/worst@4/mean:np.float64(0.0005333333333333334) - val-aux/aime2025/score/worst@4/std:np.float64(0.005733473202092254) - val-aux/aime2025/score/maj@4/mean:np.float64(0.033733333333333324) - val-aux/aime2025/score/maj@4/std:np.float64(0.066306466703177) - val-aux/aime2025/score/best@8/mean:np.float64(0.13110000000000002) - val-aux/aime2025/score/best@8/std:np.float64(0.0822673277824263) - val-aux/aime2025/score/worst@8/mean:np.float64(0.0) - val-aux/aime2025/score/worst@8/std:np.float64(0.0) - val-aux/aime2025/score/maj@8/mean:np.float64(0.039299999999999995) - val-aux/aime2025/score/maj@8/std:np.float64(0.06291732646825995) - val-aux/aime2025/score/best@16/mean:np.float64(0.1672666666666667) - val-aux/aime2025/score/best@16/std:np.float64(0.0595087442472739) - val-aux/aime2025/score/worst@16/mean:np.float64(0.0) - val-aux/aime2025/score/worst@16/std:np.float64(0.0) - val-aux/aime2025/score/maj@16/mean:np.float64(0.045233333333333334) - val-aux/aime2025/score/maj@16/std:np.float64(0.05445292614629577) - val-core/aime2025/acc/mean@16:np.float64(0.03125) - val-aux/aime2025/acc/std@16:np.float64(0.06806952844592577) - val-aux/aime2025/acc/best@2/mean:np.float64(0.05463333333333334) - val-aux/aime2025/acc/best@2/std:np.float64(0.08256691254498638) - val-aux/aime2025/acc/worst@2/mean:np.float64(0.0076) - val-aux/aime2025/acc/worst@2/std:np.float64(0.03260575969630515) - 
val-aux/aime2025/acc/maj@2/mean:np.float64(0.0309) - val-aux/aime2025/acc/maj@2/std:np.float64(0.06736238320501967) - val-aux/aime2025/acc/best@4/mean:np.float64(0.08866666666666667) - val-aux/aime2025/acc/best@4/std:np.float64(0.09004603544054997) - val-aux/aime2025/acc/worst@4/mean:np.float64(0.0005333333333333334) - val-aux/aime2025/acc/worst@4/std:np.float64(0.005733473202092254) - val-aux/aime2025/acc/maj@4/mean:np.float64(0.033733333333333324) - val-aux/aime2025/acc/maj@4/std:np.float64(0.066306466703177) - val-aux/aime2025/acc/best@8/mean:np.float64(0.13110000000000002) - val-aux/aime2025/acc/best@8/std:np.float64(0.0822673277824263) - val-aux/aime2025/acc/worst@8/mean:np.float64(0.0) - val-aux/aime2025/acc/worst@8/std:np.float64(0.0) - val-aux/aime2025/acc/maj@8/mean:np.float64(0.039299999999999995) - val-aux/aime2025/acc/maj@8/std:np.float64(0.06291732646825995) - val-core/aime2025/acc/best@16/mean:np.float64(0.1672666666666667) - val-core/aime2025/acc/best@16/std:np.float64(0.0595087442472739) - val-aux/aime2025/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2025/acc/worst@16/std:np.float64(0.0) - val-core/aime2025/acc/maj@16/mean:np.float64(0.045233333333333334) - val-core/aime2025/acc/maj@16/std:np.float64(0.05445292614629577) - val-aux/math500/reward/mean@4:np.float64(0.6345) - val-aux/math500/reward/std@4:np.float64(0.14432497224277932) - val-aux/math500/reward/best@2/mean:np.float64(0.69923) - val-aux/math500/reward/best@2/std:np.float64(0.11974749352778667) - val-aux/math500/reward/worst@2/mean:np.float64(0.5698840000000001) - val-aux/math500/reward/worst@2/std:np.float64(0.12812850604348075) - val-aux/math500/reward/maj@2/mean:np.float64(0.634614) - val-aux/math500/reward/maj@2/std:np.float64(0.14418709168298102) - val-aux/math500/reward/best@4/mean:np.float64(0.747424) - val-aux/math500/reward/best@4/std:np.float64(0.07570249500748837) - val-aux/math500/reward/worst@4/mean:np.float64(0.5156620000000001) - 
val-aux/math500/reward/worst@4/std:np.float64(0.08983268328505803) - val-aux/math500/reward/maj@4/mean:np.float64(0.64823) - val-aux/math500/reward/maj@4/std:np.float64(0.1315503259841275) - val-aux/math500/score/mean@4:np.float64(0.6345) - val-aux/math500/score/std@4:np.float64(0.14432497224277932) - val-aux/math500/score/best@2/mean:np.float64(0.69923) - val-aux/math500/score/best@2/std:np.float64(0.11974749352778667) - val-aux/math500/score/worst@2/mean:np.float64(0.5698840000000001) - val-aux/math500/score/worst@2/std:np.float64(0.12812850604348075) - val-aux/math500/score/maj@2/mean:np.float64(0.634614) - val-aux/math500/score/maj@2/std:np.float64(0.14418709168298102) - val-aux/math500/score/best@4/mean:np.float64(0.747424) - val-aux/math500/score/best@4/std:np.float64(0.07570249500748837) - val-aux/math500/score/worst@4/mean:np.float64(0.5156620000000001) - val-aux/math500/score/worst@4/std:np.float64(0.08983268328505803) - val-aux/math500/score/maj@4/mean:np.float64(0.64823) - val-aux/math500/score/maj@4/std:np.float64(0.1315503259841275) - val-core/math500/acc/mean@4:np.float64(0.6345) - val-aux/math500/acc/std@4:np.float64(0.14432497224277932) - val-aux/math500/acc/best@2/mean:np.float64(0.69923) - val-aux/math500/acc/best@2/std:np.float64(0.11974749352778667) - val-aux/math500/acc/worst@2/mean:np.float64(0.5698840000000001) - val-aux/math500/acc/worst@2/std:np.float64(0.12812850604348075) - val-aux/math500/acc/maj@2/mean:np.float64(0.634614) - val-aux/math500/acc/maj@2/std:np.float64(0.14418709168298102) - val-core/math500/acc/best@4/mean:np.float64(0.747424) - val-core/math500/acc/best@4/std:np.float64(0.07570249500748837) - val-aux/math500/acc/worst@4/mean:np.float64(0.5156620000000001) - val-aux/math500/acc/worst@4/std:np.float64(0.08983268328505803) - val-core/math500/acc/maj@4/mean:np.float64(0.64823) - val-core/math500/acc/maj@4/std:np.float64(0.1315503259841275) - val-aux/num_turns/min:np.int32(2) - val-aux/num_turns/max:np.int32(2) - 
val-aux/num_turns/mean:np.float64(2.0) - val-aux/response_length/clip_ratio:0.07533783783783783 - val-aux/aime2024/response_length/clip_ratio:0.17708333333333334 - val-aux/aime2025/response_length/clip_ratio:0.13541666666666666 - val-aux/math500/response_length/clip_ratio:0.0365 - training/global_step:50 - training/epoch:0 - critic/score/mean:0.5128205418586731 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5046595335006714 - critic/rewards/max:1.0008913278579712 - critic/rewards/min:-0.041394539177417755 - critic/advantages/mean:-0.13406871259212494 - critic/advantages/max:2.4748408794403076 - critic/advantages/min:-2.474848747253418 - critic/returns/mean:-0.13406871259212494 - critic/returns/max:2.4748408794403076 - critic/returns/min:-2.474848747253418 - response_length/mean:1105.264404296875 - response_length/max:8192.0 - response_length/min:127.0 - response_length/clip_ratio:0.006410256493836641 - response_length_non_aborted/mean:1105.264404296875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:127.0 - response_length_non_aborted/clip_ratio:0.006410256493836641 - response/aborted_ratio:0.0 - prompt_length/mean:235.06410217285156 - prompt_length/max:478.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.24849882721901e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0285055600106716) - timing_s/agent_loop/generate_sequences/max:np.float64(27.723613111302257) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.442085828150084) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - 
timing_s/agent_loop/slowest/generate_sequences:np.float64(27.723613111302257) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:204 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.194683206267655 - timing_s/reward:0.00015201233327388763 - timing_s/old_log_prob:8.130036488175392 - timing_s/ref:17.226662918925285 - timing_s/adv:0.06135308649390936 - timing_s/update_actor:17.22579527180642 - timing_s/save_checkpoint:62.44815237633884 - timing_s/update_weights:24.798077325336635 - timing_s/step:160.46102427411824 - timing_s/testing:139.148019480519 - timing_s/stop_profile:0.0006520682945847511 - timing_per_token_ms/adv:7.335683163918786e-05 - timing_per_token_ms/update_actor:0.020596025983639223 - timing_per_token_ms/gen:0.043780397146911496 - timing_per_token_ms/ref:0.02059706338611167 - perf/total_num_tokens:1473159 - perf/time_per_step:160.46102427411824 - perf/throughput:2295.197551343337 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:523.0 - frontier/mean_score:1.9525809 - frontier/mean_frontier_pct:0.19088303396024625 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:6.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.3598999999999997 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:1.49 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - 
frontier/cluster_5/score:2.51 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.6569999999999996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:1.7398999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.343 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.3 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.0458999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:1.763 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.0569999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:1.9540999999999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:2.462350999999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.4659 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:1.763 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:2.4659 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - 
frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:1.7 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:2.8319299999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.343 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:1.7398999999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.5540999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:2.6569999999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:1.7 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.9429999999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:1.49 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:2.51 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:1.91 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:1.763 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:1.6601 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:1.763 - frontier/cluster_39/cluster_size:246.0 - 
frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.53 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.7598999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.343 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.7598999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:1.343 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.637 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:1.343 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:1.7 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:1.2401 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:1.763 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.0569999999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.1880699999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - 
frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:1.8976456999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:1.7 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.6569999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:50.0 - cluster/prob_snapshot/cluster_0:0.027313462930906487 - cluster/prob_snapshot/cluster_1:0.012112580662237171 - cluster/prob_snapshot/cluster_2:0.010917581093546657 - cluster/prob_snapshot/cluster_3:0.018697272163184894 - cluster/prob_snapshot/cluster_4:0.016258497533204255 - cluster/prob_snapshot/cluster_5:0.020404414404171338 - cluster/prob_snapshot/cluster_6:0.012112580662237171 - cluster/prob_snapshot/cluster_7:0.02159941397286185 - cluster/prob_snapshot/cluster_8:0.014144079929011039 - cluster/prob_snapshot/cluster_9:0.010917581093546657 - cluster/prob_snapshot/cluster_10:0.018697272163184894 - cluster/prob_snapshot/cluster_11:0.01663163005159129 - cluster/prob_snapshot/cluster_12:0.014331865575519551 - cluster/prob_snapshot/cluster_13:0.01672186471290057 - cluster/prob_snapshot/cluster_14:0.015885365014817215 - cluster/prob_snapshot/cluster_15:0.020017063829691508 - cluster/prob_snapshot/cluster_16:0.020045914533564187 - cluster/prob_snapshot/cluster_17:0.010081081395463298 - cluster/prob_snapshot/cluster_18:0.015526865144210063 - cluster/prob_snapshot/cluster_19:0.016258497533204255 - cluster/prob_snapshot/cluster_20:0.014331865575519551 - 
cluster/prob_snapshot/cluster_21:0.020045914533564187 - cluster/prob_snapshot/cluster_22:0.013819722903223617 - cluster/prob_snapshot/cluster_23:0.013819722903223617 - cluster/prob_snapshot/cluster_24:0.02302146345960356 - cluster/prob_snapshot/cluster_25:0.010917581093546657 - cluster/prob_snapshot/cluster_26:0.010917581093546657 - cluster/prob_snapshot/cluster_27:0.014144079929011039 - cluster/prob_snapshot/cluster_28:0.020762914274778492 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02159941397286185 - cluster/prob_snapshot/cluster_31:0.013819722903223617 - cluster/prob_snapshot/cluster_32:0.015795130353507932 - cluster/prob_snapshot/cluster_33:0.013819722903223617 - cluster/prob_snapshot/cluster_34:0.012112580662237171 - cluster/prob_snapshot/cluster_35:0.020404414404171338 - cluster/prob_snapshot/cluster_36:0.015526865144210063 - cluster/prob_snapshot/cluster_37:0.014331865575519551 - cluster/prob_snapshot/cluster_38:0.013495365877436192 - cluster/prob_snapshot/cluster_39:0.014331865575519551 - cluster/prob_snapshot/cluster_40:0.02869624814610551 - cluster/prob_snapshot/cluster_41:0.022435913670945207 - cluster/prob_snapshot/cluster_42:0.010917581093546657 - cluster/prob_snapshot/cluster_43:0.022435913670945207 - cluster/prob_snapshot/cluster_44:0.012112580662237171 - cluster/prob_snapshot/cluster_45:0.010917581093546657 - cluster/prob_snapshot/cluster_46:0.013307580230927683 - cluster/prob_snapshot/cluster_47:0.022435913670945207 - cluster/prob_snapshot/cluster_48:0.010917581093546657 - cluster/prob_snapshot/cluster_49:0.013819722903223617 - cluster/prob_snapshot/cluster_50:0.010917581093546657 - cluster/prob_snapshot/cluster_51:0.010081081395463298 - cluster/prob_snapshot/cluster_52:0.012112580662237171 - cluster/prob_snapshot/cluster_53:0.013819722903223617 - cluster/prob_snapshot/cluster_54:0.014331865575519551 - cluster/prob_snapshot/cluster_55:0.01672186471290057 - cluster/prob_snapshot/cluster_56:0.017787365348739117 - 
cluster/prob_snapshot/cluster_57:0.018697272163184894 - cluster/prob_snapshot/cluster_58:0.015426433966172827 - cluster/prob_snapshot/cluster_59:0.016258497533204255 - cluster/prob_snapshot/cluster_60:0.013819722903223617 - cluster/prob_snapshot/cluster_61:0.013819722903223617 - cluster/prob_snapshot/cluster_62:0.02159941397286185 - cluster/prob_snapshot/cluster_63:0.010917581093546657
[36m(TaskRunner pid=2823680)[0m Training Progress:   6%|▋         | 51/800 [1:31:45<29:44:08, 142.92s/it]
[36m(TaskRunner pid=2823680)[0m step:51 - global_seqlen/min:340202 - global_seqlen/max:451312 - global_seqlen/minmax_diff:111110 - global_seqlen/balanced_min:382725 - global_seqlen/balanced_max:382792 - global_seqlen/mean:382762.5 - frontier/skipped_zero_acc_count:45.0 - actor/entropy:np.float64(0.3120866317656778) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011454198509454727 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.021721431723563) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0001856715927195702) - actor/ppo_kl:np.float64(-2.8428411676188825e-05) - actor/pg_clipfrac_lower:np.float64(8.577986807809101e-06) - actor/grad_norm:np.float64(0.23162311180071396) - perf/mfu/actor:np.float64(0.25854864472465394) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(114.75292587280273) - actor/lr:np.float64(1e-06) - training/global_step:51 - training/epoch:0 - critic/score/mean:0.46084338426589966 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.45179757475852966 - critic/rewards/max:1.0014376640319824 - critic/rewards/min:-0.041585832834243774 - critic/advantages/mean:-0.10972555726766586 - critic/advantages/max:2.474851369857788 - critic/advantages/min:-2.4748618602752686 - critic/returns/mean:-0.10972555726766586 - critic/returns/max:2.474851369857788 - critic/returns/min:-2.4748618602752686 - response_length/mean:1073.8961181640625 - response_length/max:8192.0 - response_length/min:163.0 - response_length/clip_ratio:0.004518072120845318 - response_length_non_aborted/mean:1073.8961181640625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:163.0 - response_length_non_aborted/clip_ratio:0.004518072120845318 - response/aborted_ratio:0.0 - prompt_length/mean:221.96385192871094 - prompt_length/max:344.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.124260395765305e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2464929763227701) - timing_s/agent_loop/generate_sequences/max:np.float64(28.817217792384326) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.852437515618476) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.817217792384326) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:196 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.81233904697001 - timing_s/reward:0.00018097646534442902 - timing_s/old_log_prob:7.598204585723579 - timing_s/ref:18.983518920838833 - timing_s/adv:0.06079643964767456 - timing_s/update_actor:17.372801923193038 - timing_s/update_weights:24.205406107008457 - timing_s/step:99.40407970361412 - timing_s/stop_profile:4.998408257961273e-05 - timing_per_token_ms/adv:7.065648090091656e-05 - timing_per_token_ms/update_actor:0.020190344276656124 - timing_per_token_ms/gen:0.043211001276135357 - timing_per_token_ms/ref:0.02206228933528909 - perf/total_num_tokens:1531050 - perf/time_per_step:99.40407970361412 - perf/throughput:3850.5713361187486 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:568.0 - frontier/mean_score:1.9308797047619044 - frontier/mean_frontier_pct:0.2154612708486149 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.2519299999999993 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:1.49 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.51 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.51 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:2.7598999999999996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:1.7398999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:1.2401 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.3 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.0458999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:1.5340999999999998 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.0569999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:1.9540999999999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:2.023645699999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.4659 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:1.763 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:48.0 - frontier/cluster_21/score:2.0261299999999998 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:1.7 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:2.8319299999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:1.2401 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:1.7398999999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.5540999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:2.6569999999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:1.7 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.9429999999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - 
frontier/cluster_34/score:1.49 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:2.6569999999999996 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:1.91 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:1.763 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:1.46207 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.1340999999999997 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.3709999999999996 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.7598999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.343 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.7598999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:1.343 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:1.637 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:1.343 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:1.7 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:1.2401 - 
frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:1.763 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.0569999999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.1880699999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:1.8976456999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:1.7 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.6569999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:51.0 - cluster/prob_snapshot/cluster_0:0.026732859836186156 - cluster/prob_snapshot/cluster_1:0.01224871419615963 - cluster/prob_snapshot/cluster_2:0.011040284003652605 - cluster/prob_snapshot/cluster_3:0.02063374002171857 - cluster/prob_snapshot/cluster_4:0.0164412271089391 - cluster/prob_snapshot/cluster_5:0.02063374002171857 - cluster/prob_snapshot/cluster_6:0.01224871419615963 - cluster/prob_snapshot/cluster_7:0.022688071348980505 - cluster/prob_snapshot/cluster_8:0.014303045523421565 - cluster/prob_snapshot/cluster_9:0.010194382868897688 - 
cluster/prob_snapshot/cluster_10:0.018907411175279963 - cluster/prob_snapshot/cluster_11:0.016818553271089248 - cluster/prob_snapshot/cluster_12:0.012611243253911735 - cluster/prob_snapshot/cluster_13:0.01690980208154386 - cluster/prob_snapshot/cluster_14:0.016063900946788945 - cluster/prob_snapshot/cluster_15:0.01663560927086401 - cluster/prob_snapshot/cluster_16:0.020271210963966464 - cluster/prob_snapshot/cluster_17:0.010194382868897688 - cluster/prob_snapshot/cluster_18:0.01570137188903684 - cluster/prob_snapshot/cluster_19:0.018907411175279963 - cluster/prob_snapshot/cluster_20:0.014492941696529815 - cluster/prob_snapshot/cluster_21:0.016656031741117388 - cluster/prob_snapshot/cluster_22:0.013975043042598233 - cluster/prob_snapshot/cluster_23:0.013975043042598233 - cluster/prob_snapshot/cluster_24:0.023280202143308948 - cluster/prob_snapshot/cluster_25:0.011040284003652605 - cluster/prob_snapshot/cluster_26:0.010194382868897688 - cluster/prob_snapshot/cluster_27:0.014303045523421565 - cluster/prob_snapshot/cluster_28:0.020996269079470675 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02184217021422559 - cluster/prob_snapshot/cluster_31:0.013975043042598233 - cluster/prob_snapshot/cluster_32:0.015972652136334333 - cluster/prob_snapshot/cluster_33:0.013975043042598233 - cluster/prob_snapshot/cluster_34:0.01224871419615963 - cluster/prob_snapshot/cluster_35:0.02184217021422559 - cluster/prob_snapshot/cluster_36:0.01570137188903684 - cluster/prob_snapshot/cluster_37:0.014492941696529815 - cluster/prob_snapshot/cluster_38:0.012019112459583293 - cluster/prob_snapshot/cluster_39:0.017543611386593463 - cluster/prob_snapshot/cluster_40:0.027711688292116846 - cluster/prob_snapshot/cluster_41:0.022688071348980505 - cluster/prob_snapshot/cluster_42:0.011040284003652605 - cluster/prob_snapshot/cluster_43:0.022688071348980505 - cluster/prob_snapshot/cluster_44:0.01224871419615963 - cluster/prob_snapshot/cluster_45:0.011040284003652605 - 
cluster/prob_snapshot/cluster_46:0.013457144388666653 - cluster/prob_snapshot/cluster_47:0.022688071348980505 - cluster/prob_snapshot/cluster_48:0.011040284003652605 - cluster/prob_snapshot/cluster_49:0.013975043042598233 - cluster/prob_snapshot/cluster_50:0.011040284003652605 - cluster/prob_snapshot/cluster_51:0.010194382868897688 - cluster/prob_snapshot/cluster_52:0.01224871419615963 - cluster/prob_snapshot/cluster_53:0.013975043042598233 - cluster/prob_snapshot/cluster_54:0.014492941696529815 - cluster/prob_snapshot/cluster_55:0.01690980208154386 - cluster/prob_snapshot/cluster_56:0.017987277900128187 - cluster/prob_snapshot/cluster_57:0.018907411175279963 - cluster/prob_snapshot/cluster_58:0.015599811963000853 - cluster/prob_snapshot/cluster_59:0.013975043042598233 - cluster/prob_snapshot/cluster_60:0.01224871419615963 - cluster/prob_snapshot/cluster_61:0.01224871419615963 - cluster/prob_snapshot/cluster_62:0.02184217021422559 - cluster/prob_snapshot/cluster_63:0.011040284003652605
[36m(TaskRunner pid=2823680)[0m Training Progress:   6%|▋         | 52/800 [1:33:26<27:06:16, 130.45s/it]
[36m(TaskRunner pid=2823680)[0m step:52 - global_seqlen/min:295065 - global_seqlen/max:398954 - global_seqlen/minmax_diff:103889 - global_seqlen/balanced_min:364806 - global_seqlen/balanced_max:365002 - global_seqlen/mean:364884.5 - frontier/skipped_zero_acc_count:43.0 - actor/entropy:np.float64(0.30770079886843993) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01064765639603138 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.024997550674015656) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00022781475541536523) - actor/ppo_kl:np.float64(-3.146615334019263e-05) - actor/pg_clipfrac_lower:np.float64(5.640737537421871e-07) - actor/grad_norm:np.float64(0.24358532374555414) - perf/mfu/actor:np.float64(0.24670567104523655) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(115.13887405395508) - actor/lr:np.float64(1e-06) - training/global_step:52 - training/epoch:0 - critic/score/mean:0.5308823585510254 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5229600071907043 - critic/rewards/max:1.0003855228424072 - critic/rewards/min:-0.04949693754315376 - critic/advantages/mean:-0.13345196843147278 - critic/advantages/max:2.4748425483703613 - critic/advantages/min:-2.474855661392212 - critic/returns/mean:-0.13345196843147278 - critic/returns/max:2.4748425483703613 - critic/returns/min:-2.474855661392212 - response_length/mean:1079.4822998046875 - response_length/max:8192.0 - response_length/min:226.0 - response_length/clip_ratio:0.0058823530562222 - response_length_non_aborted/mean:1079.4822998046875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:226.0 - response_length_non_aborted/clip_ratio:0.0058823530562222 - response/aborted_ratio:0.0 - prompt_length/mean:235.3882293701172 - prompt_length/max:416.0 - prompt_length/min:183.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.827168494462967e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4023915333673358) - timing_s/agent_loop/generate_sequences/max:np.float64(28.283464993350208) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.451750187077778) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.283464993350208) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:206 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.95500639360398 - timing_s/reward:0.00010765157639980316 - timing_s/old_log_prob:8.55813131481409 - timing_s/ref:19.18831733521074 - timing_s/adv:0.058777871541678905 - timing_s/update_actor:17.326960598118603 - timing_s/update_weights:25.652502563782036 - timing_s/step:101.10476422403008 - timing_s/stop_profile:4.987139254808426e-05 - timing_per_token_ms/adv:6.573882415366185e-05 - timing_per_token_ms/update_actor:0.019378959904484676 - timing_per_token_ms/gen:0.04080796677275053 - timing_per_token_ms/ref:0.02146075361387694 - perf/total_num_tokens:1459538 - perf/time_per_step:101.10476422403008 - perf/throughput:3608.9743426084374 - frontier/active_count:62.0 - frontier/completed_count:2.0 - frontier/blacklisted_count:611.0 - frontier/mean_score:1.9286640964516126 - frontier/mean_frontier_pct:0.22149074626719018 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.1763509999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:1.49 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.51 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:2.0569999999999995 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.2319299999999993 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:1.7398999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:1.7680699999999998 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.3 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.0458999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.37387 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.339899999999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:1.9540999999999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:1.7165519899999993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.4659 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:1.5340999999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:48.0 - frontier/cluster_21/score:2.0261299999999998 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:1.7 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:2.8319299999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:1.2401 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:1.7398999999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.5540999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.1598999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:1.7 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.9429999999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:1.49 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 
- frontier/cluster_34/score:1.49 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:2.6569999999999996 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:1.91 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:1.763 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.3234489999999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.3938699999999997 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.3709999999999996 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.7598999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.343 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.7598999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:1.343 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.0458999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:1.343 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:1.7 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - 
frontier/cluster_51/score:1.2401 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:1.763 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.0569999999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.1880699999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.51 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:2.2283519899999993 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:1.7 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.6569999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:52.0 - cluster/prob_snapshot/cluster_0:0.026563188393557986 - cluster/prob_snapshot/cluster_1:0.0124605721176285 - cluster/prob_snapshot/cluster_2:0.01123124050602354 - cluster/prob_snapshot/cluster_3:0.020990628198152707 - cluster/prob_snapshot/cluster_4:0.016725600157890604 - cluster/prob_snapshot/cluster_5:0.01720227976239048 - cluster/prob_snapshot/cluster_6:0.0124605721176285 - cluster/prob_snapshot/cluster_7:0.018665184380200383 - cluster/prob_snapshot/cluster_8:0.014550435857356927 - 
cluster/prob_snapshot/cluster_9:0.014786015935580819 - cluster/prob_snapshot/cluster_10:0.019234440181574195 - cluster/prob_snapshot/cluster_11:0.01710945268151419 - cluster/prob_snapshot/cluster_12:0.011489400144460581 - cluster/prob_snapshot/cluster_13:0.019568115904724105 - cluster/prob_snapshot/cluster_14:0.016341747634267014 - cluster/prob_snapshot/cluster_15:0.014355181117485709 - cluster/prob_snapshot/cluster_16:0.02062182871467122 - cluster/prob_snapshot/cluster_17:0.01037070837790007 - cluster/prob_snapshot/cluster_18:0.015972948150785527 - cluster/prob_snapshot/cluster_19:0.019234440181574195 - cluster/prob_snapshot/cluster_20:0.012829371601109986 - cluster/prob_snapshot/cluster_21:0.016944120123953445 - cluster/prob_snapshot/cluster_22:0.014216760134207014 - cluster/prob_snapshot/cluster_23:0.014216760134207014 - cluster/prob_snapshot/cluster_24:0.023682864427567566 - cluster/prob_snapshot/cluster_25:0.01123124050602354 - cluster/prob_snapshot/cluster_26:0.01037070837790007 - cluster/prob_snapshot/cluster_27:0.014550435857356927 - cluster/prob_snapshot/cluster_28:0.021359427681634194 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.018062811890513952 - cluster/prob_snapshot/cluster_31:0.014216760134207014 - cluster/prob_snapshot/cluster_32:0.016248920553390722 - cluster/prob_snapshot/cluster_33:0.0124605721176285 - cluster/prob_snapshot/cluster_34:0.0124605721176285 - cluster/prob_snapshot/cluster_35:0.022219959809757665 - cluster/prob_snapshot/cluster_36:0.015972948150785527 - cluster/prob_snapshot/cluster_37:0.014743616539180568 - cluster/prob_snapshot/cluster_38:0.01106773940168008 - cluster/prob_snapshot/cluster_39:0.02001945622498479 - cluster/prob_snapshot/cluster_40:0.02819099906612461 - cluster/prob_snapshot/cluster_41:0.023080491937881135 - cluster/prob_snapshot/cluster_42:0.01123124050602354 - cluster/prob_snapshot/cluster_43:0.023080491937881135 - cluster/prob_snapshot/cluster_44:0.0124605721176285 - 
cluster/prob_snapshot/cluster_45:0.01123124050602354 - cluster/prob_snapshot/cluster_46:0.01710945268151419 - cluster/prob_snapshot/cluster_47:0.023080491937881135 - cluster/prob_snapshot/cluster_48:0.01123124050602354 - cluster/prob_snapshot/cluster_49:0.014216760134207014 - cluster/prob_snapshot/cluster_50:0.01123124050602354 - cluster/prob_snapshot/cluster_51:0.01037070837790007 - cluster/prob_snapshot/cluster_52:0.0124605721176285 - cluster/prob_snapshot/cluster_53:0.014216760134207014 - cluster/prob_snapshot/cluster_54:0.014743616539180568 - cluster/prob_snapshot/cluster_55:0.01720227976239048 - cluster/prob_snapshot/cluster_56:0.018298391968737844 - cluster/prob_snapshot/cluster_57:0.020990628198152707 - cluster/prob_snapshot/cluster_58:0.018635262197889916 - cluster/prob_snapshot/cluster_59:0.014216760134207014 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0124605721176285 - cluster/prob_snapshot/cluster_62:0.022219959809757665 - cluster/prob_snapshot/cluster_63:0.01123124050602354
[36m(TaskRunner pid=2823680)[0m Training Progress:   7%|▋         | 53/800 [1:35:13<25:33:07, 123.14s/it]
[36m(TaskRunner pid=2823680)[0m step:53 - global_seqlen/min:350581 - global_seqlen/max:458223 - global_seqlen/minmax_diff:107642 - global_seqlen/balanced_min:392105 - global_seqlen/balanced_max:392156 - global_seqlen/mean:392126.75 - frontier/skipped_zero_acc_count:44.0 - actor/entropy:np.float64(0.2903814023094518) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010672449134290218 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.038316188351018354) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00021241291497495869) - actor/ppo_kl:np.float64(4.361844116558478e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.22481207955967297) - perf/mfu/actor:np.float64(0.2529133596129393) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(114.60378646850586) - actor/lr:np.float64(1e-06) - training/global_step:53 - training/epoch:0 - critic/score/mean:0.543154776096344 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5346701145172119 - critic/rewards/max:1.0025553703308105 - critic/rewards/min:-0.061175812035799026 - critic/advantages/mean:-0.1502426266670227 - critic/advantages/max:2.474820137023926 - critic/advantages/min:-2.4748528003692627 - critic/returns/mean:-0.1502426266670227 - critic/returns/max:2.474820137023926 - critic/returns/min:-2.4748528003692627 - response_length/mean:1161.730712890625 - response_length/max:8192.0 - response_length/min:218.0 - response_length/clip_ratio:0.0074404762126505375 - response_length_non_aborted/mean:1161.730712890625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:218.0 - response_length_non_aborted/clip_ratio:0.0074404762126505375 - response/aborted_ratio:0.0 - prompt_length/mean:236.0833282470703 - prompt_length/max:393.0 - prompt_length/min:170.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010663457214832306 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7191586643457413) - timing_s/agent_loop/generate_sequences/max:np.float64(29.709632692858577) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.083121273301913) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.709632692858577) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:235 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.39039989002049 - timing_s/reward:0.0002221493050456047 - timing_s/old_log_prob:8.971475510858 - timing_s/ref:19.850645666010678 - timing_s/adv:0.06380562577396631 - timing_s/update_actor:18.17402007430792 - timing_s/update_weights:26.039491697214544 - timing_s/step:105.89211518596858 - timing_s/stop_profile:6.190873682498932e-05 - timing_per_token_ms/adv:6.792666884619619e-05 - timing_per_token_ms/update_actor:0.01934783380332164 - timing_per_token_ms/gen:0.041489823513539414 - timing_per_token_ms/ref:0.021132748377313937 - perf/total_num_tokens:1568507 - perf/time_per_step:105.89211518596858 - perf/throughput:3703.07788555686 - frontier/active_count:62.0 - frontier/completed_count:2.0 - frontier/blacklisted_count:655.0 - frontier/mean_score:1.9344274690806451 - frontier/mean_frontier_pct:0.24184919098499483 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:8.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.1234456999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:1.49 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.51 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:2.0569999999999995 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.2319299999999993 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:1.7398999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.1376489999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.3 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.3321299999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.37387 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.339899999999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:1.9540999999999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:1.7165519899999993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.4659 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 
- frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:1.5340999999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.7182909999999998 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:1.7 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:2.8319299999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:1.2401 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:1.2401 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:1.7398999999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.6878699999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.1598999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:1.7 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.6601 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:1.49 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:1.343 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.7598999999999996 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.237 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.5340999999999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.3234489999999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.3938699999999997 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:48.0 - frontier/cluster_40/score:3.8596999999999997 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.7598999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.343 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.7598999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:1.343 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:1.2401 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.0458999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:1.343 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.09 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - 
frontier/cluster_51/score:1.2401 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:1.763 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.0569999999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.1880699999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.51 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.8598463929999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:1.7 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.6569999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:53.0 - cluster/prob_snapshot/cluster_0:0.026042928596105797 - cluster/prob_snapshot/cluster_1:0.012423447479236679 - cluster/prob_snapshot/cluster_2:0.011197778499741516 - cluster/prob_snapshot/cluster_3:0.020928089377774538 - cluster/prob_snapshot/cluster_4:0.01667576842850561 - cluster/prob_snapshot/cluster_5:0.017151027828718014 - cluster/prob_snapshot/cluster_6:0.012423447479236679 - cluster/prob_snapshot/cluster_7:0.018609573914317258 - cluster/prob_snapshot/cluster_8:0.014507084744378451 - 
cluster/prob_snapshot/cluster_9:0.01782346985271329 - cluster/prob_snapshot/cluster_10:0.01917713369278145 - cluster/prob_snapshot/cluster_11:0.01944502991258539 - cluster/prob_snapshot/cluster_12:0.0114551689854355 - cluster/prob_snapshot/cluster_13:0.01950981527293013 - cluster/prob_snapshot/cluster_14:0.016293059543071402 - cluster/prob_snapshot/cluster_15:0.014312411740365232 - cluster/prob_snapshot/cluster_16:0.02056038868392599 - cluster/prob_snapshot/cluster_17:0.010339810214094903 - cluster/prob_snapshot/cluster_18:0.015925358849222856 - cluster/prob_snapshot/cluster_19:0.01917713369278145 - cluster/prob_snapshot/cluster_20:0.012791148173085225 - cluster/prob_snapshot/cluster_21:0.014326911404392665 - cluster/prob_snapshot/cluster_22:0.014174403164229767 - cluster/prob_snapshot/cluster_23:0.014174403164229767 - cluster/prob_snapshot/cluster_24:0.02361230444286894 - cluster/prob_snapshot/cluster_25:0.010339810214094903 - cluster/prob_snapshot/cluster_26:0.010339810214094903 - cluster/prob_snapshot/cluster_27:0.014507084744378451 - cluster/prob_snapshot/cluster_28:0.02241114884296368 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.018008996114364628 - cluster/prob_snapshot/cluster_31:0.014174403164229767 - cluster/prob_snapshot/cluster_32:0.01384172158408108 - cluster/prob_snapshot/cluster_33:0.012423447479236679 - cluster/prob_snapshot/cluster_34:0.011197778499741516 - cluster/prob_snapshot/cluster_35:0.023011726642916314 - cluster/prob_snapshot/cluster_36:0.018651846987283525 - cluster/prob_snapshot/cluster_37:0.012791148173085225 - cluster/prob_snapshot/cluster_38:0.011034764525468659 - cluster/prob_snapshot/cluster_39:0.019959810883973358 - cluster/prob_snapshot/cluster_40:0.03218173170175155 - cluster/prob_snapshot/cluster_41:0.023011726642916314 - cluster/prob_snapshot/cluster_42:0.011197778499741516 - cluster/prob_snapshot/cluster_43:0.023011726642916314 - cluster/prob_snapshot/cluster_44:0.011197778499741516 - 
cluster/prob_snapshot/cluster_45:0.010339810214094903 - cluster/prob_snapshot/cluster_46:0.01705847731393981 - cluster/prob_snapshot/cluster_47:0.023011726642916314 - cluster/prob_snapshot/cluster_48:0.011197778499741516 - cluster/prob_snapshot/cluster_49:0.01742617800778836 - cluster/prob_snapshot/cluster_50:0.011197778499741516 - cluster/prob_snapshot/cluster_51:0.010339810214094903 - cluster/prob_snapshot/cluster_52:0.012423447479236679 - cluster/prob_snapshot/cluster_53:0.014174403164229767 - cluster/prob_snapshot/cluster_54:0.014699689869727695 - cluster/prob_snapshot/cluster_55:0.017151027828718014 - cluster/prob_snapshot/cluster_56:0.018243874312680132 - cluster/prob_snapshot/cluster_57:0.020928089377774538 - cluster/prob_snapshot/cluster_58:0.015507183881129713 - cluster/prob_snapshot/cluster_59:0.014174403164229767 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.012423447479236679 - cluster/prob_snapshot/cluster_62:0.0221537583572697 - cluster/prob_snapshot/cluster_63:0.011197778499741516
[36m(RewardLoopWorker pid=2826754)[0m WARNING:2026-04-12 13:07:37,038:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:   7%|▋         | 54/800 [1:37:02<24:41:48, 119.18s/it]
[36m(TaskRunner pid=2823680)[0m step:54 - global_seqlen/min:293822 - global_seqlen/max:454982 - global_seqlen/minmax_diff:161160 - global_seqlen/balanced_min:402231 - global_seqlen/balanced_max:402325 - global_seqlen/mean:402280.0 - frontier/skipped_zero_acc_count:39.0 - actor/entropy:np.float64(0.27886705088118713) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010191015899181366 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.007952293381094933) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00021264425296168256) - actor/ppo_kl:np.float64(3.523773171360113e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.21490169192353883) - perf/mfu/actor:np.float64(0.23879183453611053) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(115.35002899169922) - actor/lr:np.float64(1e-06) - training/global_step:54 - training/epoch:0 - critic/score/mean:0.48876404762268066 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.47982895374298096 - critic/rewards/max:1.0001434087753296 - critic/rewards/min:-0.06282193213701248 - critic/advantages/mean:-0.14871831238269806 - critic/advantages/max:2.4748599529266357 - critic/advantages/min:-2.474838972091675 - critic/returns/mean:-0.14871831238269806 - critic/returns/max:2.4748599529266357 - critic/returns/min:-2.474838972091675 - response_length/mean:1218.5126953125 - response_length/max:8192.0 - response_length/min:198.0 - response_length/clip_ratio:0.01123595517128706 - response_length_non_aborted/mean:1218.5126953125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:198.0 - response_length_non_aborted/clip_ratio:0.01123595517128706 - response/aborted_ratio:0.0 - prompt_length/mean:235.61798095703125 - prompt_length/max:395.0 - prompt_length/min:181.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.916109800338745e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.650448308326304) - timing_s/agent_loop/generate_sequences/max:np.float64(29.57452207338065) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.211685103106902) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.57452207338065) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:206 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.903165134601295 - timing_s/reward:0.00012130755931138992 - timing_s/old_log_prob:9.37320451810956 - timing_s/ref:20.563977268524468 - timing_s/adv:0.08452126663178205 - timing_s/update_actor:19.79304374754429 - timing_s/update_weights:27.584842327982187 - timing_s/step:109.7249052496627 - timing_s/stop_profile:5.9283338487148285e-05 - timing_per_token_ms/adv:8.163616299536294e-05 - timing_per_token_ms/update_actor:0.019117415177747513 - timing_per_token_ms/gen:0.036772549346517844 - timing_per_token_ms/ref:0.01986203315480066 - perf/total_num_tokens:1609120 - perf/time_per_step:109.7249052496627 - perf/throughput:3666.2597163758924 - frontier/active_count:62.0 - frontier/completed_count:2.0 - frontier/blacklisted_count:694.0 - frontier/mean_score:1.916338775532258 - frontier/mean_frontier_pct:0.2613723475216326 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:8.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.1234456999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:1.49 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.51 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:2.0569999999999995 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:1.8623509999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:1.7398999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.1376489999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.3 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.3321299999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.37387 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.339899999999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:1.6678699999999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:1.7165519899999993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.4659 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 
- frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:1.91 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:1.5340999999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.7182909999999998 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:1.49 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.8823509999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:1.2401 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:1.2401 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:1.7398999999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.7815089999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.4119299999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:1.7 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.6601 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:1.49 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - 
frontier/cluster_34/score:1.8400999999999998 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.7598999999999996 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.237 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.5340999999999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.3234489999999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.3938699999999997 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:48.0 - frontier/cluster_40/score:3.6017899999999994 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.8319299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.2401 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.7598999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:1.343 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:1.2401 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.0458999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.09 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:1.2401 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:1.763 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.339899999999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.1880699999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.0569999999999995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.8598463929999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:1.7 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.7598999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:54.0 - cluster/prob_snapshot/cluster_0:0.026288752852491073 - cluster/prob_snapshot/cluster_1:0.0125407148106374 - cluster/prob_snapshot/cluster_2:0.011303476503816128 - cluster/prob_snapshot/cluster_3:0.021125633674295218 - cluster/prob_snapshot/cluster_4:0.014308198106096362 - cluster/prob_snapshot/cluster_5:0.017312919708376596 - cluster/prob_snapshot/cluster_6:0.0125407148106374 - cluster/prob_snapshot/cluster_7:0.015674639441815682 - 
cluster/prob_snapshot/cluster_8:0.014644019932233562 - cluster/prob_snapshot/cluster_9:0.017991709043116928 - cluster/prob_snapshot/cluster_10:0.019358150378836255 - cluster/prob_snapshot/cluster_11:0.01962857532304147 - cluster/prob_snapshot/cluster_12:0.011563296548248593 - cluster/prob_snapshot/cluster_13:0.019693972204973452 - cluster/prob_snapshot/cluster_14:0.01403777316189114 - cluster/prob_snapshot/cluster_15:0.014447509371961136 - cluster/prob_snapshot/cluster_16:0.020754462182248835 - cluster/prob_snapshot/cluster_17:0.010437409689041235 - cluster/prob_snapshot/cluster_18:0.016075681401555325 - cluster/prob_snapshot/cluster_19:0.016075681401555325 - cluster/prob_snapshot/cluster_20:0.012911886302683782 - cluster/prob_snapshot/cluster_21:0.014462145901130838 - cluster/prob_snapshot/cluster_22:0.014308198106096362 - cluster/prob_snapshot/cluster_23:0.0125407148106374 - cluster/prob_snapshot/cluster_24:0.0242595583054735 - cluster/prob_snapshot/cluster_25:0.010437409689041235 - cluster/prob_snapshot/cluster_26:0.010437409689041235 - cluster/prob_snapshot/cluster_27:0.014644019932233562 - cluster/prob_snapshot/cluster_28:0.023410812826994105 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02030021897531588 - cluster/prob_snapshot/cluster_31:0.014308198106096362 - cluster/prob_snapshot/cluster_32:0.01397237627995916 - cluster/prob_snapshot/cluster_33:0.0125407148106374 - cluster/prob_snapshot/cluster_34:0.015487361961781127 - cluster/prob_snapshot/cluster_35:0.02322893879589138 - cluster/prob_snapshot/cluster_36:0.018827905390198567 - cluster/prob_snapshot/cluster_37:0.012911886302683782 - cluster/prob_snapshot/cluster_38:0.011138923809008897 - cluster/prob_snapshot/cluster_39:0.02014821541190641 - cluster/prob_snapshot/cluster_40:0.03031477932738636 - cluster/prob_snapshot/cluster_41:0.0238351855662338 - cluster/prob_snapshot/cluster_42:0.010437409689041235 - cluster/prob_snapshot/cluster_43:0.02322893879589138 - 
cluster/prob_snapshot/cluster_44:0.011303476503816128 - cluster/prob_snapshot/cluster_45:0.010437409689041235 - cluster/prob_snapshot/cluster_46:0.01721949559133091 - cluster/prob_snapshot/cluster_47:0.02322893879589138 - cluster/prob_snapshot/cluster_48:0.010437409689041235 - cluster/prob_snapshot/cluster_49:0.017590667083377292 - cluster/prob_snapshot/cluster_50:0.011303476503816128 - cluster/prob_snapshot/cluster_51:0.010437409689041235 - cluster/prob_snapshot/cluster_52:0.0125407148106374 - cluster/prob_snapshot/cluster_53:0.014308198106096362 - cluster/prob_snapshot/cluster_54:0.01483844309473405 - cluster/prob_snapshot/cluster_55:0.019693972204973452 - cluster/prob_snapshot/cluster_56:0.018416081782356628 - cluster/prob_snapshot/cluster_57:0.017312919708376596 - cluster/prob_snapshot/cluster_58:0.015653559198795734 - cluster/prob_snapshot/cluster_59:0.014308198106096362 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0125407148106374 - cluster/prob_snapshot/cluster_62:0.02322893879589138 - cluster/prob_snapshot/cluster_63:0.011303476503816128
(RewardLoopWorker pid=2826757) WARNING:2026-04-12 13:09:27,728:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:   7%|▋         | 55/800 [1:38:40<23:19:55, 112.75s/it]
[36m(TaskRunner pid=2823680)[0m step:55 - global_seqlen/min:321308 - global_seqlen/max:496717 - global_seqlen/minmax_diff:175409 - global_seqlen/balanced_min:391086 - global_seqlen/balanced_max:391157 - global_seqlen/mean:391114.0 - frontier/skipped_zero_acc_count:51.0 - actor/entropy:np.float64(0.28702347687421703) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011041730642318726 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.023116533215215895) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00020255870540262177) - actor/ppo_kl:np.float64(-1.1694566868031987e-05) - actor/pg_clipfrac_lower:np.float64(1.328825955846323e-06) - actor/grad_norm:np.float64(0.2279013380408287) - perf/mfu/actor:np.float64(0.2842579363570624) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(115.22940444946289) - actor/lr:np.float64(1e-06) - training/global_step:55 - training/epoch:0 - critic/score/mean:0.5503246784210205 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5410807728767395 - critic/rewards/max:1.0009194612503052 - critic/rewards/min:-0.057792745530605316 - critic/advantages/mean:-0.09143583476543427 - critic/advantages/max:2.4748446941375732 - critic/advantages/min:-2.474855422973633 - critic/returns/mean:-0.09143583476543427 - critic/returns/max:2.4748446941375732 - critic/returns/min:-2.474855422973633 - response_length/mean:1160.017822265625 - response_length/max:8192.0 - response_length/min:178.0 - response_length/clip_ratio:0.003246753243729472 - response_length_non_aborted/mean:1160.017822265625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:178.0 - response_length_non_aborted/clip_ratio:0.003246753243729472 - response/aborted_ratio:0.0 - prompt_length/mean:231.9220733642578 - prompt_length/max:360.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00013755261898040771 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.436269225552678) - timing_s/agent_loop/generate_sequences/max:np.float64(28.804703714326024) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.028708505796203) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.804703714326024) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:213 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.868077053688467 - timing_s/reward:0.00017850659787654877 - timing_s/old_log_prob:7.78966165240854 - timing_s/ref:17.670913096517324 - timing_s/adv:0.06946092285215855 - timing_s/update_actor:16.125019951723516 - timing_s/update_weights:24.631486580707133 - timing_s/step:97.52206533960998 - timing_s/stop_profile:6.369967013597488e-05 - timing_per_token_ms/adv:8.101013237406748e-05 - timing_per_token_ms/update_actor:0.018806113526650434 - timing_per_token_ms/gen:0.04319805457216773 - timing_per_token_ms/ref:0.020609041031118772 - perf/total_num_tokens:1564456 - perf/time_per_step:97.52206533960998 - perf/throughput:4010.518015979133 - frontier/active_count:62.0 - frontier/completed_count:2.0 - frontier/blacklisted_count:745.0 - frontier/mean_score:1.9138970914999998 - frontier/mean_frontier_pct:0.2838142183203971 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:8.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.0864119899999993 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:1.49 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.51 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.7398999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:1.8623509999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.1179299999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:1.7963542999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.3 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.3321299999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.37387 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.5379299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:1.6678699999999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:1.7165519899999993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.4659 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:1.91 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:1.5340999999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.7182909999999998 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:1.9429999999999998 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.8823509999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:1.2401 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:1.2401 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:1.5179299999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.7815089999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:1.9883509999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:1.7 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.6601 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:1.49 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.1880699999999997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.7598999999999996 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.237 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.5340999999999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.3234489999999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.3938699999999997 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.421252999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.8319299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.2401 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.7598999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:1.2401 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:1.2401 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.0458999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.09 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:1.7680699999999998 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:1.763 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.5379299999999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.1880699999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:1.7398999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.8598463929999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:1.7 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.7598999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:55.0 - cluster/prob_snapshot/cluster_0:0.026010196039000087 - cluster/prob_snapshot/cluster_1:0.012556713822936562 - cluster/prob_snapshot/cluster_2:0.010450725377062839 - cluster/prob_snapshot/cluster_3:0.021152585030584408 - cluster/prob_snapshot/cluster_4:0.014326452012746412 - cluster/prob_snapshot/cluster_5:0.014662702268810281 - cluster/prob_snapshot/cluster_6:0.012556713822936562 - 
cluster/prob_snapshot/cluster_7:0.015694636607288405 - cluster/prob_snapshot/cluster_8:0.017848483830209413 - cluster/prob_snapshot/cluster_9:0.015138460986376865 - cluster/prob_snapshot/cluster_10:0.019382846840774556 - cluster/prob_snapshot/cluster_11:0.01965361678381546 - cluster/prob_snapshot/cluster_12:0.011578048603971713 - cluster/prob_snapshot/cluster_13:0.021387960209829113 - cluster/prob_snapshot/cluster_14:0.014055682069705503 - cluster/prob_snapshot/cluster_15:0.01446594100712903 - cluster/prob_snapshot/cluster_16:0.02078094001072434 - cluster/prob_snapshot/cluster_17:0.010450725377062839 - cluster/prob_snapshot/cluster_18:0.016096190202556263 - cluster/prob_snapshot/cluster_19:0.016096190202556263 - cluster/prob_snapshot/cluster_20:0.012928358842796629 - cluster/prob_snapshot/cluster_21:0.014480596209078848 - cluster/prob_snapshot/cluster_22:0.012556713822936562 - cluster/prob_snapshot/cluster_23:0.01637429191809781 - cluster/prob_snapshot/cluster_24:0.02429050781493625 - cluster/prob_snapshot/cluster_25:0.010450725377062839 - cluster/prob_snapshot/cluster_26:0.010450725377062839 - cluster/prob_snapshot/cluster_27:0.012792089002181269 - cluster/prob_snapshot/cluster_28:0.02344067953618956 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.016756479521174313 - cluster/prob_snapshot/cluster_31:0.014326452012746412 - cluster/prob_snapshot/cluster_32:0.01399020175668254 - cluster/prob_snapshot/cluster_33:0.012556713822936562 - cluster/prob_snapshot/cluster_34:0.018439576385605904 - cluster/prob_snapshot/cluster_35:0.023258573476458128 - cluster/prob_snapshot/cluster_36:0.018851925383831604 - cluster/prob_snapshot/cluster_37:0.012928358842796629 - cluster/prob_snapshot/cluster_38:0.011153134464598367 - cluster/prob_snapshot/cluster_39:0.020173919811619558 - cluster/prob_snapshot/cluster_40:0.02883200995762629 - cluster/prob_snapshot/cluster_41:0.023865593675562905 - cluster/prob_snapshot/cluster_42:0.010450725377062839 - 
cluster/prob_snapshot/cluster_43:0.023258573476458128 - cluster/prob_snapshot/cluster_44:0.010450725377062839 - cluster/prob_snapshot/cluster_45:0.010450725377062839 - cluster/prob_snapshot/cluster_46:0.017241463631104636 - cluster/prob_snapshot/cluster_47:0.023258573476458128 - cluster/prob_snapshot/cluster_48:0.010450725377062839 - cluster/prob_snapshot/cluster_49:0.017613108650964705 - cluster/prob_snapshot/cluster_50:0.011317897090069666 - cluster/prob_snapshot/cluster_51:0.014900100005986204 - cluster/prob_snapshot/cluster_52:0.012556713822936562 - cluster/prob_snapshot/cluster_53:0.014326452012746412 - cluster/prob_snapshot/cluster_54:0.014857373469689367 - cluster/prob_snapshot/cluster_55:0.021387960209829113 - cluster/prob_snapshot/cluster_56:0.018439576385605904 - cluster/prob_snapshot/cluster_57:0.014662702268810281 - cluster/prob_snapshot/cluster_58:0.01567352947082 - cluster/prob_snapshot/cluster_59:0.014326452012746412 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.012556713822936562 - cluster/prob_snapshot/cluster_62:0.023258573476458128 - cluster/prob_snapshot/cluster_63:0.011317897090069666
(TaskRunner pid=2823680) Training Progress:   7%|▋         | 56/800 [1:40:28<22:59:07, 111.22s/it]
[36m(TaskRunner pid=2823680)[0m step:56 - global_seqlen/min:386632 - global_seqlen/max:416814 - global_seqlen/minmax_diff:30182 - global_seqlen/balanced_min:399985 - global_seqlen/balanced_max:400073 - global_seqlen/mean:400025.0 - frontier/skipped_zero_acc_count:40.0 - actor/entropy:np.float64(0.24386371972716667) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011188003234565258 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.008111897030175896) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00025449387315099807) - actor/ppo_kl:np.float64(-1.644694117204763e-05) - actor/pg_clipfrac_lower:np.float64(2.9860300607899925e-07) - actor/grad_norm:np.float64(0.21504273265600204) - perf/mfu/actor:np.float64(0.24610051960156282) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(115.4271469116211) - actor/lr:np.float64(1e-06) - training/global_step:56 - training/epoch:0 - critic/score/mean:0.5340909361839294 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5257818102836609 - critic/rewards/max:1.0026566982269287 - critic/rewards/min:-0.05807608366012573 - critic/advantages/mean:-0.19638213515281677 - critic/advantages/max:2.474851369857788 - critic/advantages/min:-2.4748570919036865 - critic/returns/mean:-0.19638213515281677 - critic/returns/max:2.474851369857788 - critic/returns/min:-2.4748570919036865 - response_length/mean:1134.7230224609375 - response_length/max:8192.0 - response_length/min:152.0 - response_length/clip_ratio:0.014204545877873898 - response_length_non_aborted/mean:1134.7230224609375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:152.0 - response_length_non_aborted/clip_ratio:0.014204545877873898 - response/aborted_ratio:0.0 - prompt_length/mean:233.34091186523438 - prompt_length/max:555.0 - prompt_length/min:170.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.635316044092178e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.260635631158948) - timing_s/agent_loop/generate_sequences/max:np.float64(29.109394373372197) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.872722858291127) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.109394373372197) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:195 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.57332145422697 - timing_s/reward:0.00014981068670749664 - timing_s/old_log_prob:9.383590656332672 - timing_s/ref:21.493405047804117 - timing_s/adv:0.06622243579477072 - timing_s/update_actor:19.248311208561063 - timing_s/update_weights:26.299258491024375 - timing_s/step:107.44746861513704 - timing_s/stop_profile:5.4406002163887024e-05 - timing_per_token_ms/adv:6.875845384804828e-05 - timing_per_token_ms/update_actor:0.019985433969664186 - timing_per_token_ms/gen:0.03827190688334654 - timing_per_token_ms/ref:0.022316504690296318 - perf/total_num_tokens:1600100 - perf/time_per_step:107.44746861513704 - perf/throughput:3722.9820781803423 - frontier/active_count:62.0 - frontier/completed_count:2.0 - frontier/blacklisted_count:785.0 - frontier/mean_score:1.9041218480483868 - frontier/mean_frontier_pct:0.30745379303924775 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:6.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.060488392999999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.343 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.51 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.49 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.7398999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:1.8623509999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.1179299999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:1.7963542999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.51 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.5324909999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.37387 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.5379299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:1.6678699999999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:1.5015863929999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.4659 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:1.91 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:1.5340999999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.7182909999999998 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.2600999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.8823509999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:1.2401 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:1.2401 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:1.5179299999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.7815089999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:1.9883509999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:1.7 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.6601 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:1.49 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.1880699999999997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:2.2319299999999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.237 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.5340999999999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.3234489999999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.3938699999999997 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:3.894877099999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.8319299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.2401 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.7598999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:1.2401 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:1.2401 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.0458999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.2401 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.3629999999999995 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - 
frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:1.7680699999999998 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:32.0 - frontier/cluster_53/score:1.49 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:1.763 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.5379299999999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:1.8316489999999999 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:1.7398999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.8598463929999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.09 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:2.2319299999999993 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.2401 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:56.0 - cluster/prob_snapshot/cluster_0:0.02592413719044446 - cluster/prob_snapshot/cluster_1:0.011376000094102274 - cluster/prob_snapshot/cluster_2:0.01050437655748044 - cluster/prob_snapshot/cluster_3:0.02126117664646069 - cluster/prob_snapshot/cluster_4:0.012621176574990609 - cluster/prob_snapshot/cluster_5:0.014737976592500776 - cluster/prob_snapshot/cluster_6:0.012621176574990609 
- cluster/prob_snapshot/cluster_7:0.015775208601080757 - cluster/prob_snapshot/cluster_8:0.017940113089577082 - cluster/prob_snapshot/cluster_9:0.015216177725868222 - cluster/prob_snapshot/cluster_10:0.02126117664646069 - cluster/prob_snapshot/cluster_11:0.0214516886480366 - cluster/prob_snapshot/cluster_12:0.011637487155088824 - cluster/prob_snapshot/cluster_13:0.021497760177829468 - cluster/prob_snapshot/cluster_14:0.014127840116865492 - cluster/prob_snapshot/cluster_15:0.012719320140037743 - cluster/prob_snapshot/cluster_16:0.02088762370219419 - cluster/prob_snapshot/cluster_17:0.01050437655748044 - cluster/prob_snapshot/cluster_18:0.016178823663242994 - cluster/prob_snapshot/cluster_19:0.016178823663242994 - cluster/prob_snapshot/cluster_20:0.012994729519257108 - cluster/prob_snapshot/cluster_21:0.014554935649810192 - cluster/prob_snapshot/cluster_22:0.011376000094102274 - cluster/prob_snapshot/cluster_23:0.019144376628950518 - cluster/prob_snapshot/cluster_24:0.02441520867255084 - cluster/prob_snapshot/cluster_25:0.01050437655748044 - cluster/prob_snapshot/cluster_26:0.01050437655748044 - cluster/prob_snapshot/cluster_27:0.01285776010635939 - cluster/prob_snapshot/cluster_28:0.02356101760666144 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.016842502727556476 - cluster/prob_snapshot/cluster_31:0.014400000119116803 - cluster/prob_snapshot/cluster_32:0.014062023645732825 - cluster/prob_snapshot/cluster_33:0.012621176574990609 - cluster/prob_snapshot/cluster_34:0.018534240153315232 - cluster/prob_snapshot/cluster_35:0.018905760156388444 - cluster/prob_snapshot/cluster_36:0.018948706039096642 - cluster/prob_snapshot/cluster_37:0.012994729519257108 - cluster/prob_snapshot/cluster_38:0.011210391622144125 - cluster/prob_snapshot/cluster_39:0.020277487226558904 - cluster/prob_snapshot/cluster_40:0.032991900414085465 - cluster/prob_snapshot/cluster_41:0.02398811313960614 - cluster/prob_snapshot/cluster_42:0.01050437655748044 - 
cluster/prob_snapshot/cluster_43:0.023377976663970856 - cluster/prob_snapshot/cluster_44:0.01050437655748044 - cluster/prob_snapshot/cluster_45:0.01050437655748044 - cluster/prob_snapshot/cluster_46:0.0173299766139418 - cluster/prob_snapshot/cluster_47:0.023377976663970856 - cluster/prob_snapshot/cluster_48:0.01050437655748044 - cluster/prob_snapshot/cluster_49:0.02001600016557235 - cluster/prob_snapshot/cluster_50:0.011376000094102274 - cluster/prob_snapshot/cluster_51:0.014976593065062849 - cluster/prob_snapshot/cluster_52:0.012621176574990609 - cluster/prob_snapshot/cluster_53:0.012621176574990609 - cluster/prob_snapshot/cluster_54:0.014933647182354659 - cluster/prob_snapshot/cluster_55:0.021497760177829468 - cluster/prob_snapshot/cluster_56:0.015515144598929512 - cluster/prob_snapshot/cluster_57:0.014737976592500776 - cluster/prob_snapshot/cluster_58:0.01575399310631703 - cluster/prob_snapshot/cluster_59:0.0177035295582083 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.012621176574990609 - cluster/prob_snapshot/cluster_62:0.018905760156388444 - cluster/prob_snapshot/cluster_63:0.01050437655748044
[36m(TaskRunner pid=2823680)[0m Training Progress:   7%|▋         | 57/800 [1:42:13<22:33:50, 109.33s/it]
[36m(TaskRunner pid=2823680)[0m step:57 - global_seqlen/min:326142 - global_seqlen/max:395606 - global_seqlen/minmax_diff:69464 - global_seqlen/balanced_min:370170 - global_seqlen/balanced_max:370327 - global_seqlen/mean:370222.75 - frontier/skipped_zero_acc_count:36.0 - actor/entropy:np.float64(0.27411229171506735) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010160648263990879 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.004076632380019873) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0001720882004836154) - actor/ppo_kl:np.float64(-1.77719884811228e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.22014945124586424) - perf/mfu/actor:np.float64(0.23238702218215113) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(115.0353775024414) - actor/lr:np.float64(1e-06) - training/global_step:57 - training/epoch:0 - critic/score/mean:0.489130437374115 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4807074964046478 - critic/rewards/max:1.001685619354248 - critic/rewards/min:-0.05589520186185837 - critic/advantages/mean:-0.14452379941940308 - critic/advantages/max:2.474851131439209 - critic/advantages/min:-2.474841356277466 - critic/returns/mean:-0.14452379941940308 - critic/returns/max:2.474851131439209 - critic/returns/min:-2.474841356277466 - response_length/mean:1116.2391357421875 - response_length/max:8192.0 - response_length/min:180.0 - response_length/clip_ratio:0.006793478038161993 - response_length_non_aborted/mean:1116.2391357421875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:180.0 - response_length_non_aborted/clip_ratio:0.006793478038161993 - response/aborted_ratio:0.0 - prompt_length/mean:232.94564819335938 - prompt_length/max:404.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.561089634895325e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5120291896164417) - timing_s/agent_loop/generate_sequences/max:np.float64(27.8835624800995) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.753876887086335) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.8835624800995) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:213 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.849003829993308 - timing_s/reward:0.0002445308491587639 - timing_s/old_log_prob:9.946983685716987 - timing_s/ref:19.466076031327248 - timing_s/adv:0.07166437525302172 - timing_s/update_actor:18.649335470981896 - timing_s/update_weights:26.332123266533017 - timing_s/step:104.71432831790298 - timing_s/stop_profile:6.327126175165176e-05 - timing_per_token_ms/adv:7.21695621883401e-05 - timing_per_token_ms/update_actor:0.01878080107853162 - timing_per_token_ms/gen:0.0363324583592923 - timing_per_token_ms/ref:0.019603299125203675 - perf/total_num_tokens:1480891 - perf/time_per_step:104.71432831790298 - perf/throughput:3535.550062222985 - frontier/active_count:59.0 - frontier/completed_count:5.0 - frontier/blacklisted_count:821.0 - frontier/mean_score:1.944569355576271 - frontier/mean_frontier_pct:0.3160873088624 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.060488392999999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.343 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.6569999999999996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.49 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.7398999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:1.8623509999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.1179299999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:1.7963542999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.51 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.6727436999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.37387 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.5379299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.0675089999999994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:1.5015863929999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:2.0261299999999998 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:1.91 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:1.5340999999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.7182909999999998 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.4820699999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.9176456999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:1.2401 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:1.3625509999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.7815089999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:1.9883509999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:1.7 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.6601 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:1.49 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.4316489999999997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:2.2319299999999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.237 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.5340999999999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.3234489999999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.3938699999999997 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:3.894877099999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:2.2823509999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.2401 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.7598999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:1.2401 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:1.2401 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.0458999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.8319299999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.3629999999999995 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:80.0 - frontier/cluster_51/score:1.5376489999999998 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:32.0 - frontier/cluster_53/score:1.49 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:2.1340999999999997 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.676550999999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:1.8316489999999999 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:1.7398999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.8598463929999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.09 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:2.2319299999999993 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.2401 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:57.0 - cluster/prob_snapshot/cluster_0:0.02667566702024172 - cluster/prob_snapshot/cluster_1:0.011705785550477871 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.023158802835159864 - cluster/prob_snapshot/cluster_4:0.012987059173650058 - cluster/prob_snapshot/cluster_5:0.015165224333042772 - cluster/prob_snapshot/cluster_6:0.012987059173650058 - 
cluster/prob_snapshot/cluster_7:0.016232525261145204 - cluster/prob_snapshot/cluster_8:0.01846018941989843 - cluster/prob_snapshot/cluster_9:0.0156572883160676 - cluster/prob_snapshot/cluster_10:0.02187752921198768 - cluster/prob_snapshot/cluster_11:0.023296027240201606 - cluster/prob_snapshot/cluster_12:0.01197485301134403 - cluster/prob_snapshot/cluster_13:0.02212097120039039 - cluster/prob_snapshot/cluster_14:0.01802071256715037 - cluster/prob_snapshot/cluster_15:0.013088047879354863 - cluster/prob_snapshot/cluster_16:0.017660047116448047 - cluster/prob_snapshot/cluster_17:0.01080889401425734 - cluster/prob_snapshot/cluster_18:0.01664784095414202 - cluster/prob_snapshot/cluster_19:0.01664784095414202 - cluster/prob_snapshot/cluster_20:0.013371441260601713 - cluster/prob_snapshot/cluster_21:0.014976877110436463 - cluster/prob_snapshot/cluster_22:0.011705785550477871 - cluster/prob_snapshot/cluster_23:0.02163408722358496 - cluster/prob_snapshot/cluster_24:0.02543062909640647 - cluster/prob_snapshot/cluster_25:0.01080889401425734 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.01187619494235977 - cluster/prob_snapshot/cluster_28:0.024244041593986704 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01733075979529279 - cluster/prob_snapshot/cluster_31:0.014817450063896039 - cluster/prob_snapshot/cluster_32:0.014469675794749302 - cluster/prob_snapshot/cluster_33:0.012987059173650058 - cluster/prob_snapshot/cluster_34:0.021194610370836905 - cluster/prob_snapshot/cluster_35:0.019453830188889104 - cluster/prob_snapshot/cluster_36:0.019498021054667906 - cluster/prob_snapshot/cluster_37:0.013371441260601713 - cluster/prob_snapshot/cluster_38:0.01153537615859597 - cluster/prob_snapshot/cluster_39:0.020865323049681653 - cluster/prob_snapshot/cluster_40:0.033948321726036595 - cluster/prob_snapshot/cluster_41:0.019893307041637165 - cluster/prob_snapshot/cluster_42:0.01080889401425734 - 
cluster/prob_snapshot/cluster_43:0.024055694371380398 - cluster/prob_snapshot/cluster_44:0.01080889401425734 - cluster/prob_snapshot/cluster_45:0.01080889401425734 - cluster/prob_snapshot/cluster_46:0.01783236534454406 - cluster/prob_snapshot/cluster_47:0.024683518446734765 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.02059625558881549 - cluster/prob_snapshot/cluster_50:0.011705785550477871 - cluster/prob_snapshot/cluster_51:0.01340237486664687 - cluster/prob_snapshot/cluster_52:0.012987059173650058 - cluster/prob_snapshot/cluster_53:0.012987059173650058 - cluster/prob_snapshot/cluster_54:0.018601129518447373 - cluster/prob_snapshot/cluster_55:0.023329212227041762 - cluster/prob_snapshot/cluster_56:0.015964922112991244 - cluster/prob_snapshot/cluster_57:0.015165224333042772 - cluster/prob_snapshot/cluster_58:0.016210694738114508 - cluster/prob_snapshot/cluster_59:0.01821674743149572 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.012987059173650058 - cluster/prob_snapshot/cluster_62:0.019453830188889104 - cluster/prob_snapshot/cluster_63:0.01080889401425734
[36m(RewardLoopWorker pid=2826759)[0m WARNING:2026-04-12 13:14:36,342:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:   7%|▋         | 58/800 [1:43:55<22:07:12, 107.32s/it]
[36m(TaskRunner pid=2823680)[0m step:58 - global_seqlen/min:352184 - global_seqlen/max:453776 - global_seqlen/minmax_diff:101592 - global_seqlen/balanced_min:401570 - global_seqlen/balanced_max:401666 - global_seqlen/mean:401612.5 - frontier/skipped_zero_acc_count:50.0 - actor/entropy:np.float64(0.2758904009675368) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.007911617867648602 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.048845709388842806) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00019769661561533724) - actor/ppo_kl:np.float64(1.3320580076371978e-05) - actor/pg_clipfrac_lower:np.float64(2.064827220657697e-07) - actor/grad_norm:np.float64(0.19665722250938417) - perf/mfu/actor:np.float64(0.2602480722489807) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(114.9295539855957) - actor/lr:np.float64(1e-06) - training/global_step:58 - training/epoch:0 - critic/score/mean:0.5128205418586731 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5051523447036743 - critic/rewards/max:1.0025957822799683 - critic/rewards/min:-0.033386003226041794 - critic/advantages/mean:-0.1312291920185089 - critic/advantages/max:2.4748382568359375 - critic/advantages/min:-2.474858283996582 - critic/returns/mean:-0.1312291920185089 - critic/returns/max:2.4748382568359375 - critic/returns/min:-2.474858283996582 - response_length/mean:1245.5416259765625 - response_length/max:8192.0 - response_length/min:306.0 - response_length/clip_ratio:0.008012820966541767 - response_length_non_aborted/mean:1245.5416259765625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:306.0 - response_length_non_aborted/clip_ratio:0.008012820966541767 - response/aborted_ratio:0.0 - prompt_length/mean:234.7051239013672 - prompt_length/max:340.0 - prompt_length/min:189.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.3899667263031e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(2.3653353359550238) - timing_s/agent_loop/generate_sequences/max:np.float64(28.986902687698603) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.360373006099508) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.986902687698603) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:215 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.7661976153031 - timing_s/reward:0.00013630464673042297 - timing_s/old_log_prob:9.16496536321938 - timing_s/ref:18.29914352297783 - timing_s/adv:0.0696701742708683 - timing_s/update_actor:18.07661055214703 - timing_s/update_weights:25.66399055160582 - timing_s/step:102.42362998891622 - timing_s/stop_profile:8.632801473140717e-05 - timing_per_token_ms/adv:7.542723327804865e-05 - timing_per_token_ms/update_actor:0.019570336019144232 - timing_per_token_ms/gen:0.03958502970248128 - timing_per_token_ms/ref:0.019811257568122333 - perf/total_num_tokens:1606450 - perf/time_per_step:102.42362998891622 - perf/throughput:3921.0922327539115 - frontier/active_count:58.0 - frontier/completed_count:6.0 - frontier/blacklisted_count:871.0 - frontier/mean_score:1.9610726250189652 - frontier/mean_frontier_pct:0.32825137833256274 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:8.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.060488392999999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:1.2401 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.6569999999999996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.49 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.7398999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:1.8623509999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.1179299999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:1.7963542999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.51 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.7709205899999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:1.37387 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.676550999999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:2.3472562999999993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:1.3511104750999996 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:2.3182909999999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:1.91 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:1.5340999999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.7182909999999998 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.4820699999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.9176456999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:1.2401 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:1.3625509999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.7815089999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:1.9883509999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:1.49 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.6601 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:1.9429999999999998 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.4316489999999997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:2.462350999999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.237 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.37387 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.3234489999999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.575709 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:3.894877099999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:2.2823509999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.2401 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.2319299999999993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:1.2401 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:1.2401 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:1.7321299999999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.8319299999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.5540999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - 
frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:80.0 - frontier/cluster_51/score:1.5376489999999998 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:2.1340999999999997 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.676550999999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:1.8316489999999999 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:1.7398999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.8598463929999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.09 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:2.2319299999999993 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.2401 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:58.0 - cluster/prob_snapshot/cluster_0:0.02690723463548953 - cluster/prob_snapshot/cluster_1:0.01090272446312479 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.023359841060013355 - cluster/prob_snapshot/cluster_4:0.013099797959887055 - cluster/prob_snapshot/cluster_5:0.015296871456649316 - cluster/prob_snapshot/cluster_6:0.013099797959887055 - 
cluster/prob_snapshot/cluster_7:0.016373437470062827 - cluster/prob_snapshot/cluster_8:0.018620439659854753 - cluster/prob_snapshot/cluster_9:0.015793206976090157 - cluster/prob_snapshot/cluster_10:0.022067444885447318 - cluster/prob_snapshot/cluster_11:0.024361409323416797 - cluster/prob_snapshot/cluster_12:0.012078804981979885 - cluster/prob_snapshot/cluster_13:0.023531729751230634 - cluster/prob_snapshot/cluster_14:0.020636633080585255 - cluster/prob_snapshot/cluster_15:0.011878707547179198 - cluster/prob_snapshot/cluster_16:0.020381975645788263 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.016792358458647162 - cluster/prob_snapshot/cluster_19:0.016792358458647162 - cluster/prob_snapshot/cluster_20:0.013487516812256866 - cluster/prob_snapshot/cluster_21:0.01510688921898811 - cluster/prob_snapshot/cluster_22:0.011807401785321017 - cluster/prob_snapshot/cluster_23:0.02182188961227977 - cluster/prob_snapshot/cluster_24:0.0256513887171364 - cluster/prob_snapshot/cluster_25:0.01090272446312479 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.011979290476538297 - cluster/prob_snapshot/cluster_28:0.024454500619870787 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017481205619690857 - cluster/prob_snapshot/cluster_31:0.013099797959887055 - cluster/prob_snapshot/cluster_32:0.014595284961884898 - cluster/prob_snapshot/cluster_33:0.01708248821212117 - cluster/prob_snapshot/cluster_34:0.02137859772440362 - cluster/prob_snapshot/cluster_35:0.02164852389686298 - cluster/prob_snapshot/cluster_36:0.01966728056125325 - cluster/prob_snapshot/cluster_37:0.012078804981979885 - cluster/prob_snapshot/cluster_38:0.011635513094103732 - cluster/prob_snapshot/cluster_39:0.02264514597547834 - cluster/prob_snapshot/cluster_40:0.03424302220710792 - cluster/prob_snapshot/cluster_41:0.020065997968822934 - cluster/prob_snapshot/cluster_42:0.01090272446312479 - 
cluster/prob_snapshot/cluster_43:0.019622706080946784 - cluster/prob_snapshot/cluster_44:0.01090272446312479 - cluster/prob_snapshot/cluster_45:0.01090272446312479 - cluster/prob_snapshot/cluster_46:0.015228559087422256 - cluster/prob_snapshot/cluster_47:0.02489779250774694 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.02245516373781713 - cluster/prob_snapshot/cluster_50:0.011807401785321017 - cluster/prob_snapshot/cluster_51:0.013518718948471387 - cluster/prob_snapshot/cluster_52:0.013099797959887055 - cluster/prob_snapshot/cluster_53:0.011807401785321017 - cluster/prob_snapshot/cluster_54:0.01876260323905702 - cluster/prob_snapshot/cluster_55:0.023531729751230634 - cluster/prob_snapshot/cluster_56:0.016103511297603464 - cluster/prob_snapshot/cluster_57:0.015296871456649316 - cluster/prob_snapshot/cluster_58:0.016351417439412545 - cluster/prob_snapshot/cluster_59:0.01837488438668721 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.013099797959887055 - cluster/prob_snapshot/cluster_62:0.019622706080946784 - cluster/prob_snapshot/cluster_63:0.01090272446312479
(RewardLoopWorker pid=2826759) WARNING:2026-04-12 13:16:19,198:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:   7%|▋         | 59/800 [1:45:52<22:38:12, 109.98s/it]
[36m(TaskRunner pid=2823680)[0m step:59 - global_seqlen/min:309529 - global_seqlen/max:419434 - global_seqlen/minmax_diff:109905 - global_seqlen/balanced_min:371883 - global_seqlen/balanced_max:371936 - global_seqlen/mean:371911.5 - frontier/skipped_zero_acc_count:39.0 - actor/entropy:np.float64(0.2615538348754247) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008822704665362835 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05828970979200676) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00018023728664249777) - actor/ppo_kl:np.float64(2.4637485129611984e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2090401997168859) - perf/mfu/actor:np.float64(0.21359938928660413) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.98411560058594) - actor/lr:np.float64(1e-06) - training/global_step:59 - training/epoch:0 - critic/score/mean:0.4761236011981964 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.46870777010917664 - critic/rewards/max:1.0039793252944946 - critic/rewards/min:-0.03995094820857048 - critic/advantages/mean:-0.13927552103996277 - critic/advantages/max:2.4748172760009766 - critic/advantages/min:-2.4748497009277344 - critic/returns/mean:-0.13927552103996277 - critic/returns/max:2.4748172760009766 - critic/returns/min:-2.4748497009277344 - response_length/mean:1150.882080078125 - response_length/max:8192.0 - response_length/min:214.0 - response_length/clip_ratio:0.008426966145634651 - response_length_non_aborted/mean:1150.882080078125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:214.0 - response_length_non_aborted/clip_ratio:0.008426966145634651 - response/aborted_ratio:0.0 - prompt_length/mean:244.77528381347656 - prompt_length/max:641.0 - prompt_length/min:183.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.075125515460968e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.596857700496912) - timing_s/agent_loop/generate_sequences/max:np.float64(28.06232951581478) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.499046052803351) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.06232951581478) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:187 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.04639515466988 - timing_s/reward:0.00020332355052232742 - timing_s/old_log_prob:8.973611765541136 - timing_s/ref:20.248503858223557 - timing_s/adv:0.07675464544445276 - timing_s/update_actor:20.35671424958855 - timing_s/update_weights:35.764440681785345 - timing_s/step:115.9748068433255 - timing_s/stop_profile:6.180256605148315e-05 - timing_per_token_ms/adv:7.724064357381924e-05 - timing_per_token_ms/update_actor:0.020485609705857808 - timing_per_token_ms/gen:0.036667523143790404 - timing_per_token_ms/ref:0.020376714143615183 - perf/total_num_tokens:1487646 - perf/time_per_step:115.9748068433255 - perf/throughput:3206.830087696792 - frontier/active_count:58.0 - frontier/completed_count:6.0 - frontier/blacklisted_count:910.0 - frontier/mean_score:1.9649084628137927 - frontier/mean_frontier_pct:0.3541449047241526 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:5.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.060488392999999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:1.2401 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:2.7598999999999996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.49 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.7398999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:1.8623509999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:1.7825509999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:1.7963542999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.51 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.7709205899999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:2.461709 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.676550999999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:1.9430794099999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:1.3511104750999996 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:2.3182909999999994 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:1.91 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:1.5340999999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.1028037 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.4820699999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.9176456999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:1.2401 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:1.3625509999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.7815089999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:1.9883509999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.343 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.6601 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:1.9429999999999998 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.2021542999999997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:2.023645699999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.237 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.37387 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.3234489999999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.575709 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:3.894877099999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:2.4976456999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.2401 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.2319299999999993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:1.2401 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:1.2401 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:1.7321299999999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.5540999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:80.0 - frontier/cluster_51/score:1.5376489999999998 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:1.7938699999999996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.676550999999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:1.8316489999999999 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:1.5179299999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:1.6018924750999997 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:32.0 - frontier/cluster_59/score:2.3629999999999995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:1.8623509999999994 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.2401 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:59.0 - cluster/prob_snapshot/cluster_0:0.026854707105824716 - cluster/prob_snapshot/cluster_1:0.010881440477965321 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02421714988721594 - cluster/prob_snapshot/cluster_4:0.013074224911030021 - cluster/prob_snapshot/cluster_5:0.015267009344094716 - cluster/prob_snapshot/cluster_6:0.011784351715109609 - 
cluster/prob_snapshot/cluster_7:0.016341473716296417 - cluster/prob_snapshot/cluster_8:0.01564125683851105 - cluster/prob_snapshot/cluster_9:0.01576237593160798 - cluster/prob_snapshot/cluster_10:0.02202436545415124 - cluster/prob_snapshot/cluster_11:0.02431385168071409 - cluster/prob_snapshot/cluster_12:0.02160062894732 - cluster/prob_snapshot/cluster_13:0.02348579178512906 - cluster/prob_snapshot/cluster_14:0.017049837064652018 - cluster/prob_snapshot/cluster_15:0.011855518275910082 - cluster/prob_snapshot/cluster_16:0.020342186539071605 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.016759576899374053 - cluster/prob_snapshot/cluster_19:0.016759576899374053 - cluster/prob_snapshot/cluster_20:0.013461186869806142 - cluster/prob_snapshot/cluster_21:0.018451361421171878 - cluster/prob_snapshot/cluster_22:0.011784351715109609 - cluster/prob_snapshot/cluster_23:0.02177928954692636 - cluster/prob_snapshot/cluster_24:0.02560131281375813 - cluster/prob_snapshot/cluster_25:0.010881440477965321 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.01195590485016702 - cluster/prob_snapshot/cluster_28:0.024406761247016236 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017447079312799627 - cluster/prob_snapshot/cluster_31:0.011784351715109609 - cluster/prob_snapshot/cluster_32:0.014566792466309353 - cluster/prob_snapshot/cluster_33:0.017049140269886798 - cluster/prob_snapshot/cluster_34:0.028097775515450934 - cluster/prob_snapshot/cluster_35:0.017756777867140115 - cluster/prob_snapshot/cluster_36:0.019628886661727622 - cluster/prob_snapshot/cluster_37:0.012055225086252895 - cluster/prob_snapshot/cluster_38:0.011612798580052193 - cluster/prob_snapshot/cluster_39:0.022600938772727664 - cluster/prob_snapshot/cluster_40:0.03417617396390628 - cluster/prob_snapshot/cluster_41:0.0219159608254141 - cluster/prob_snapshot/cluster_42:0.010881440477965321 - cluster/prob_snapshot/cluster_43:0.01958439919843975 
- cluster/prob_snapshot/cluster_44:0.010881440477965321 - cluster/prob_snapshot/cluster_45:0.010881440477965321 - cluster/prob_snapshot/cluster_46:0.015198830332310353 - cluster/prob_snapshot/cluster_47:0.02529161425941764 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.022411327412927364 - cluster/prob_snapshot/cluster_50:0.011784351715109609 - cluster/prob_snapshot/cluster_51:0.01349232809410765 - cluster/prob_snapshot/cluster_52:0.013074224911030021 - cluster/prob_snapshot/cluster_53:0.011784351715109609 - cluster/prob_snapshot/cluster_54:0.015740577074596925 - cluster/prob_snapshot/cluster_55:0.02348579178512906 - cluster/prob_snapshot/cluster_56:0.01607207448594847 - cluster/prob_snapshot/cluster_57:0.013319300818254895 - cluster/prob_snapshot/cluster_58:0.014056041948150303 - cluster/prob_snapshot/cluster_59:0.020734492258230826 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.013074224911030021 - cluster/prob_snapshot/cluster_62:0.016341473716296417 - cluster/prob_snapshot/cluster_63:0.010881440477965321
[36m(TaskRunner pid=2823680)[0m Training Progress:   8%|▊         | 60/800 [1:47:35<22:12:12, 108.02s/it]
[36m(TaskRunner pid=2823680)[0m step:60 - global_seqlen/min:276761 - global_seqlen/max:440277 - global_seqlen/minmax_diff:163516 - global_seqlen/balanced_min:367304 - global_seqlen/balanced_max:367512 - global_seqlen/mean:367439.5 - frontier/skipped_zero_acc_count:42.0 - actor/entropy:np.float64(0.2910125064122122) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010411170311272144 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04094160085423937) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003020544538526866) - actor/ppo_kl:np.float64(-1.0387907125039758e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2312077134847641) - perf/mfu/actor:np.float64(0.24839326510517354) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(114.32331848144531) - actor/lr:np.float64(1e-06) - training/global_step:60 - training/epoch:0 - critic/score/mean:0.5406976938247681 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5331594944000244 - critic/rewards/max:1.0015888214111328 - critic/rewards/min:-0.037615515291690826 - critic/advantages/mean:-0.14123602211475372 - critic/advantages/max:2.4748435020446777 - critic/advantages/min:-2.4748435020446777 - critic/returns/mean:-0.14123602211475372 - critic/returns/max:2.4748435020446777 - critic/returns/min:-2.4748435020446777 - response_length/mean:1035.5203857421875 - response_length/max:8192.0 - response_length/min:207.0 - response_length/clip_ratio:0.0029069767333567142 - response_length_non_aborted/mean:1035.5203857421875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:207.0 - response_length_non_aborted/clip_ratio:0.0029069767333567142 - response/aborted_ratio:0.0 - prompt_length/mean:237.89535522460938 - prompt_length/max:477.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.962955325841904e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4029186628758907) - timing_s/agent_loop/generate_sequences/max:np.float64(28.08383372798562) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.492834059317829) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.08383372798562) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:191 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.40393305197358 - timing_s/reward:0.00012145843356847763 - timing_s/old_log_prob:8.470151675865054 - timing_s/ref:19.348452313803136 - timing_s/adv:0.07281306758522987 - timing_s/update_actor:17.34361835103482 - timing_s/update_weights:27.549839194864035 - timing_s/step:102.55667030904442 - timing_s/stop_profile:5.610194057226181e-05 - timing_per_token_ms/adv:8.31095040408509e-05 - timing_per_token_ms/update_actor:0.019796165265816873 - timing_per_token_ms/gen:0.04127226937919311 - timing_per_token_ms/ref:0.022084501162871258 - perf/total_num_tokens:1469758 - perf/time_per_step:102.55667030904442 - perf/throughput:3582.794750382957 - frontier/active_count:55.0 - frontier/completed_count:9.0 - frontier/blacklisted_count:952.0 - frontier/mean_score:2.039882391838181 - frontier/mean_frontier_pct:0.36051493398842727 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:7.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.060488392999999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:2.7598999999999996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.49 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.7398999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:1.8623509999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:1.7825509999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:1.7963542999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.51 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.8396444129999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:2.461709 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.676550999999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:1.9430794099999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:2.3182909999999994 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:1.91 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:1.5340999999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.1028037 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.4820699999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.9176456999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:1.3625509999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.8470562999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:1.9883509999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.343 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.6601 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:1.9429999999999998 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.2021542999999997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:2.9165519899999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:32.0 - frontier/cluster_36/score:1.8659000000000001 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.37387 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.8264142999999997 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.575709 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:3.6264139699999993 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:2.4976456999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.7680699999999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.462350999999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:1.2401 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:1.2401 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:1.7321299999999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:2.6878699999999993 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.3763542999999998 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:1.49 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:1.7938699999999996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.676550999999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.1821542999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:1.5179299999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:1.6018924750999997 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:32.0 - frontier/cluster_59/score:2.5540999999999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:1.8623509999999994 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.2401 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:60.0 - cluster/prob_snapshot/cluster_0:0.027278652794756362 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.024599457400473826 - cluster/prob_snapshot/cluster_4:0.01328062303949636 - cluster/prob_snapshot/cluster_5:0.015508024178805176 - cluster/prob_snapshot/cluster_6:0.011970387075197054 - 
cluster/prob_snapshot/cluster_7:0.016599450737066496 - cluster/prob_snapshot/cluster_8:0.015888179785018303 - cluster/prob_snapshot/cluster_9:0.01601121094206601 - cluster/prob_snapshot/cluster_10:0.022372056261165008 - cluster/prob_snapshot/cluster_11:0.025310232896150947 - cluster/prob_snapshot/cluster_12:0.02194163037713795 - cluster/prob_snapshot/cluster_13:0.023856553608716115 - cluster/prob_snapshot/cluster_14:0.01731899676511207 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.02066332138715238 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.017024154366065802 - cluster/prob_snapshot/cluster_19:0.017024154366065802 - cluster/prob_snapshot/cluster_20:0.01367369382878615 - cluster/prob_snapshot/cluster_21:0.018742646487086033 - cluster/prob_snapshot/cluster_22:0.011053221900187541 - cluster/prob_snapshot/cluster_23:0.022123111427948136 - cluster/prob_snapshot/cluster_24:0.02600547161376341 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.01214464845844886 - cluster/prob_snapshot/cluster_28:0.02537629630370688 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01772251013503733 - cluster/prob_snapshot/cluster_31:0.011970387075197054 - cluster/prob_snapshot/cluster_32:0.014796753226756984 - cluster/prob_snapshot/cluster_33:0.017318288970296258 - cluster/prob_snapshot/cluster_34:0.028541345082283447 - cluster/prob_snapshot/cluster_35:0.025995723190793928 - cluster/prob_snapshot/cluster_36:0.016631083576776014 - cluster/prob_snapshot/cluster_37:0.012245536627699909 - cluster/prob_snapshot/cluster_38:0.01627914082701048 - cluster/prob_snapshot/cluster_39:0.022957731737206798 - cluster/prob_snapshot/cluster_40:0.03232284357096205 - cluster/prob_snapshot/cluster_41:0.022261940287193967 - cluster/prob_snapshot/cluster_42:0.015759108172780086 - cluster/prob_snapshot/cluster_43:0.021947352632165698 - 
cluster/prob_snapshot/cluster_44:0.011053221900187541 - cluster/prob_snapshot/cluster_45:0.011053221900187541 - cluster/prob_snapshot/cluster_46:0.015438768849263642 - cluster/prob_snapshot/cluster_47:0.025690883958735145 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.023957441777967162 - cluster/prob_snapshot/cluster_50:0.011970387075197054 - cluster/prob_snapshot/cluster_51:0.012267679615496565 - cluster/prob_snapshot/cluster_52:0.01328062303949636 - cluster/prob_snapshot/cluster_53:0.011970387075197054 - cluster/prob_snapshot/cluster_54:0.01598906795426935 - cluster/prob_snapshot/cluster_55:0.023856553608716115 - cluster/prob_snapshot/cluster_56:0.0194499118606148 - cluster/prob_snapshot/cluster_57:0.013529567872713224 - cluster/prob_snapshot/cluster_58:0.01427793967222074 - cluster/prob_snapshot/cluster_59:0.022765127050454796 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.01328062303949636 - cluster/prob_snapshot/cluster_62:0.016599450737066496 - cluster/prob_snapshot/cluster_63:0.011053221900187541
[36m(TaskRunner pid=2823680)[0m Training Progress:   8%|▊         | 61/800 [1:49:25<22:19:20, 108.74s/it]
[36m(TaskRunner pid=2823680)[0m step:61 - global_seqlen/min:338831 - global_seqlen/max:384015 - global_seqlen/minmax_diff:45184 - global_seqlen/balanced_min:370944 - global_seqlen/balanced_max:371088 - global_seqlen/mean:370998.0 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.3149157982243567) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010425258427858353 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.029367989642196335) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002740252159907998) - actor/ppo_kl:np.float64(-6.248904641120599e-06) - actor/pg_clipfrac_lower:np.float64(1.3190384583585725e-06) - actor/grad_norm:np.float64(0.22580503500424898) - perf/mfu/actor:np.float64(0.19457278733746963) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(114.8044662475586) - actor/lr:np.float64(1e-06) - training/global_step:61 - training/epoch:0 - critic/score/mean:0.4247449040412903 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4164108633995056 - critic/rewards/max:1.0009456872940063 - critic/rewards/min:-0.038163650780916214 - critic/advantages/mean:-0.12115130573511124 - critic/advantages/max:2.4748332500457764 - critic/advantages/min:-2.4748427867889404 - critic/returns/mean:-0.12115130573511124 - critic/returns/max:2.4748332500457764 - critic/returns/min:-2.4748427867889404 - response_length/mean:1086.4451904296875 - response_length/max:8192.0 - response_length/min:128.0 - response_length/clip_ratio:0.005102040711790323 - response_length_non_aborted/mean:1086.4451904296875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:128.0 - response_length_non_aborted/clip_ratio:0.005102040711790323 - response/aborted_ratio:0.0 - prompt_length/mean:236.70408630371094 - prompt_length/max:477.0 - prompt_length/min:170.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.913502097129822e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9962428621947765) - timing_s/agent_loop/generate_sequences/max:np.float64(27.587635916657746) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.499234064731354) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.587635916657746) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:211 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.223732135258615 - timing_s/reward:0.00012552738189697266 - timing_s/old_log_prob:10.453849985264242 - timing_s/ref:21.03922671172768 - timing_s/adv:0.08365397062152624 - timing_s/update_actor:22.308718821033835 - timing_s/update_weights:26.13567782752216 - timing_s/step:109.65886088181287 - timing_s/stop_profile:6.690341979265213e-05 - timing_per_token_ms/adv:8.064206995092899e-05 - timing_per_token_ms/update_actor:0.021505509545036276 - timing_per_token_ms/gen:0.03430929618015435 - timing_per_token_ms/ref:0.020281724580375245 - perf/total_num_tokens:1483992 - perf/time_per_step:109.65886088181287 - perf/throughput:3383.20129369072 - frontier/active_count:53.0 - frontier/completed_count:11.0 - frontier/blacklisted_count:982.0 - frontier/mean_score:2.0884351471428295 - frontier/mean_frontier_pct:0.375351891572468 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:8.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.060488392999999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:2.7598999999999996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.49 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.7398999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:1.8623509999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:1.7825509999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:1.7963542999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:3.2569999999999997 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.8396444129999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.0231963 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.676550999999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:1.9430794099999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.5228036999999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:1.91 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:1.5340999999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.1028037 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.6374489999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.9176456999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:1.3625509999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.8470562999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:1.9883509999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.343 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.6601 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.2600999999999996 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.2021542999999997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:2.9165519899999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:1.60613 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.37387 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.8264142999999997 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.575709 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:3.6264139699999993 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.6483519899999997 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.7680699999999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.462350999999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:1.2401 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:1.7680699999999998 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:1.7321299999999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.9176456999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:2.6878699999999993 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 
- frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.3763542999999998 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:1.343 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:1.7938699999999996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.676550999999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.1821542999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:1.5179299999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.0213247325699997 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:2.6878699999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.343 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.2401 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:61.0 - cluster/prob_snapshot/cluster_0:0.027649919652936526 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.024934259977812477 - cluster/prob_snapshot/cluster_4:0.013461374458111018 - cluster/prob_snapshot/cluster_5:0.01571909088568279 - cluster/prob_snapshot/cluster_6:0.011203658030539243 - 
cluster/prob_snapshot/cluster_7:0.016825371935192956 - cluster/prob_snapshot/cluster_8:0.01610442047092634 - cluster/prob_snapshot/cluster_9:0.01622912610183751 - cluster/prob_snapshot/cluster_10:0.029425299738300388 - cluster/prob_snapshot/cluster_11:0.02565470924246701 - cluster/prob_snapshot/cluster_12:0.018278525501050146 - cluster/prob_snapshot/cluster_13:0.024181245145792944 - cluster/prob_snapshot/cluster_14:0.01755471110057411 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.022792218315441586 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.01725585584898795 - cluster/prob_snapshot/cluster_19:0.01725585584898795 - cluster/prob_snapshot/cluster_20:0.013859795004153094 - cluster/prob_snapshot/cluster_21:0.01899773692456466 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.023827978928302306 - cluster/prob_snapshot/cluster_24:0.02635941027100499 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.012309939080049409 - cluster/prob_snapshot/cluster_28:0.025721671783640303 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017963716352456038 - cluster/prob_snapshot/cluster_31:0.012133305971304091 - cluster/prob_snapshot/cluster_32:0.014998139421416173 - cluster/prob_snapshot/cluster_33:0.02041882712266893 - cluster/prob_snapshot/cluster_34:0.028929797385872726 - cluster/prob_snapshot/cluster_35:0.02634952917042876 - cluster/prob_snapshot/cluster_36:0.01451054856268849 - cluster/prob_snapshot/cluster_37:0.012412200353533546 - cluster/prob_snapshot/cluster_38:0.016500702555670276 - cluster/prob_snapshot/cluster_39:0.0232701901638434 - cluster/prob_snapshot/cluster_40:0.03276276267805031 - cluster/prob_snapshot/cluster_41:0.02392648176796878 - cluster/prob_snapshot/cluster_42:0.01597359217325661 - cluster/prob_snapshot/cluster_43:0.02224605963644571 - 
cluster/prob_snapshot/cluster_44:0.011203658030539243 - cluster/prob_snapshot/cluster_45:0.01597359217325661 - cluster/prob_snapshot/cluster_46:0.015648892979951565 - cluster/prob_snapshot/cluster_47:0.02635941027100499 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.02428350641927708 - cluster/prob_snapshot/cluster_50:0.012133305971304091 - cluster/prob_snapshot/cluster_51:0.012434644710960582 - cluster/prob_snapshot/cluster_52:0.012133305971304091 - cluster/prob_snapshot/cluster_53:0.012133305971304091 - cluster/prob_snapshot/cluster_54:0.016206681744410475 - cluster/prob_snapshot/cluster_55:0.024181245145792944 - cluster/prob_snapshot/cluster_56:0.019714628293743035 - cluster/prob_snapshot/cluster_57:0.01371370747060433 - cluster/prob_snapshot/cluster_58:0.018261616863467032 - cluster/prob_snapshot/cluster_59:0.02428350641927708 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.012133305971304091 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.011203658030539243
[36m(TaskRunner pid=2823680)[0m Training Progress:   8%|▊         | 62/800 [1:51:11<22:06:02, 107.81s/it]
[36m(TaskRunner pid=2823680)[0m step:62 - global_seqlen/min:290055 - global_seqlen/max:404042 - global_seqlen/minmax_diff:113987 - global_seqlen/balanced_min:355613 - global_seqlen/balanced_max:355691 - global_seqlen/mean:355637.0 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.32814011722803116) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011096145957708359 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.13789496215758845) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00020767751483736387) - actor/ppo_kl:np.float64(1.3388636336023172e-06) - actor/pg_clipfrac_lower:np.float64(2.5033545091940573e-07) - actor/grad_norm:np.float64(0.24558141579230627) - perf/mfu/actor:np.float64(0.2147366756522994) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(114.37304401397705) - actor/lr:np.float64(1e-06) - training/global_step:62 - training/epoch:0 - critic/score/mean:0.5219780206680298 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5137304663658142 - critic/rewards/max:1.0037543773651123 - critic/rewards/min:-0.03992550075054169 - critic/advantages/mean:-0.1409095823764801 - critic/advantages/max:2.474858045578003 - critic/advantages/min:-2.474862813949585 - critic/returns/mean:-0.1409095823764801 - critic/returns/max:2.474858045578003 - critic/returns/min:-2.474862813949585 - response_length/mean:1071.4945068359375 - response_length/max:8192.0 - response_length/min:141.0 - response_length/clip_ratio:0.005494505632668734 - response_length_non_aborted/mean:1071.4945068359375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:141.0 - response_length_non_aborted/clip_ratio:0.005494505632668734 - response/aborted_ratio:0.0 - prompt_length/mean:240.10989379882812 - prompt_length/max:546.0 - prompt_length/min:162.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.80388543009758e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0336131611838937) - timing_s/agent_loop/generate_sequences/max:np.float64(28.4315303331241) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.347512711812669) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.4315303331241) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:241 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.62579747941345 - timing_s/reward:0.00020395498722791672 - timing_s/old_log_prob:8.952525892294943 - timing_s/ref:19.66084079630673 - timing_s/adv:0.07447374053299427 - timing_s/update_actor:19.33870125003159 - timing_s/update_weights:26.364610638469458 - timing_s/step:105.40811985731125 - timing_s/stop_profile:4.7783367335796356e-05 - timing_per_token_ms/adv:7.79953883057767e-05 - timing_per_token_ms/update_actor:0.020253172494503407 - timing_per_token_ms/gen:0.03926142683451973 - timing_per_token_ms/ref:0.020590545088125786 - perf/total_num_tokens:1422548 - perf/time_per_step:105.40811985731125 - perf/throughput:3373.905164814801 - frontier/active_count:52.0 - frontier/completed_count:12.0 - frontier/blacklisted_count:1018.0 - frontier/mean_score:2.116846036935961 - frontier/mean_frontier_pct:0.37727219432659165 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:8.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.060488392999999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:2.7598999999999996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.9429999999999998 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.7398999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:1.8623509999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:1.7825509999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:1.7963542999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:48.0 - frontier/cluster_10/score:3.7798999999999996 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:96.0 - frontier/cluster_11/score:2.2877510890999995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.0231963 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.676550999999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:1.9430794099999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.5228036999999994 
- frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:1.91 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:1.5340999999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.1028037 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.6374489999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.9176456999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:1.3625509999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.8929394099999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:1.6918456999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.343 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.6601 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.2600999999999996 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.1415080099999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:2.341586392999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:1.60613 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.37387 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.1784900099999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.575709 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:3.6264139699999993 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.7538463929999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:1.5376489999999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.623645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:1.2401 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:1.7680699999999998 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:1.5124909999999998 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.9176456999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:2.7815089999999993 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 
- frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:1.343 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:1.7938699999999996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.676550999999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.1821542999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:1.9625509999999995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.0213247325699997 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:2.6878699999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.2401 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:62.0 - cluster/prob_snapshot/cluster_0:0.027803413659890676 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.025072678444212054 - cluster/prob_snapshot/cluster_4:0.01765144179756659 - cluster/prob_snapshot/cluster_5:0.015806352847959906 - cluster/prob_snapshot/cluster_6:0.01126585330579636 - 
cluster/prob_snapshot/cluster_7:0.01691877523578998 - cluster/prob_snapshot/cluster_8:0.01619382152737731 - cluster/prob_snapshot/cluster_9:0.016319219441203535 - cluster/prob_snapshot/cluster_10:0.0343390040404642 - cluster/prob_snapshot/cluster_11:0.020783378896844167 - cluster/prob_snapshot/cluster_12:0.01837999574601239 - cluster/prob_snapshot/cluster_13:0.024315483373504188 - cluster/prob_snapshot/cluster_14:0.01765216320826815 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.022918745587872172 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.01735164891062902 - cluster/prob_snapshot/cluster_19:0.01735164891062902 - cluster/prob_snapshot/cluster_20:0.013936735389421978 - cluster/prob_snapshot/cluster_21:0.019103199754121294 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02396025605638198 - cluster/prob_snapshot/cluster_24:0.02650574022618138 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.012378275693626426 - cluster/prob_snapshot/cluster_28:0.0262812926502839 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0153697971713913 - cluster/prob_snapshot/cluster_31:0.012200662035065327 - cluster/prob_snapshot/cluster_32:0.015081399139547242 - cluster/prob_snapshot/cluster_33:0.020532178902048503 - cluster/prob_snapshot/cluster_34:0.028539447141072687 - cluster/prob_snapshot/cluster_35:0.02127245287185454 - cluster/prob_snapshot/cluster_36:0.014591101499910256 - cluster/prob_snapshot/cluster_37:0.012481104653846015 - cluster/prob_snapshot/cluster_38:0.019790782098865284 - cluster/prob_snapshot/cluster_39:0.023399370818820608 - cluster/prob_snapshot/cluster_40:0.03294463979687976 - cluster/prob_snapshot/cluster_41:0.02501768364666916 - cluster/prob_snapshot/cluster_42:0.013968976751717172 - cluster/prob_snapshot/cluster_43:0.023834858142555756 - 
cluster/prob_snapshot/cluster_44:0.01126585330579636 - cluster/prob_snapshot/cluster_45:0.016062266957809344 - cluster/prob_snapshot/cluster_46:0.013740425556275493 - cluster/prob_snapshot/cluster_47:0.02650574022618138 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.025268988277358535 - cluster/prob_snapshot/cluster_50:0.012200662035065327 - cluster/prob_snapshot/cluster_51:0.01692874116896875 - cluster/prob_snapshot/cluster_52:0.012200662035065327 - cluster/prob_snapshot/cluster_53:0.012200662035065327 - cluster/prob_snapshot/cluster_54:0.016296650487596896 - cluster/prob_snapshot/cluster_55:0.024315483373504188 - cluster/prob_snapshot/cluster_56:0.019824070828491847 - cluster/prob_snapshot/cluster_57:0.017829055456127688 - cluster/prob_snapshot/cluster_58:0.018362993242893054 - cluster/prob_snapshot/cluster_59:0.024418312333723775 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.01126585330579636
[36m(TaskRunner pid=2823680)[0m Training Progress:   8%|▊         | 63/800 [1:52:55<21:48:06, 106.49s/it]
[36m(TaskRunner pid=2823680)[0m step:63 - global_seqlen/min:314569 - global_seqlen/max:389330 - global_seqlen/minmax_diff:74761 - global_seqlen/balanced_min:346178 - global_seqlen/balanced_max:346360 - global_seqlen/mean:346255.75 - frontier/skipped_zero_acc_count:26.0 - actor/entropy:np.float64(0.30205288222607446) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012376422993838787 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.039017973773297854) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002731555753901678) - actor/ppo_kl:np.float64(1.946019430151552e-05) - actor/pg_clipfrac_lower:np.float64(4.34763719300356e-07) - actor/grad_norm:np.float64(0.23578672684155977) - perf/mfu/actor:np.float64(0.2009229033447653) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(115.05949401855469) - actor/lr:np.float64(1e-06) - training/global_step:63 - training/epoch:0 - critic/score/mean:0.5723039507865906 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5634761452674866 - critic/rewards/max:1.0025577545166016 - critic/rewards/min:-0.06105642393231392 - critic/advantages/mean:-0.11051591485738754 - critic/advantages/max:2.4748435020446777 - critic/advantages/min:-2.4748222827911377 - critic/returns/mean:-0.11051591485738754 - critic/returns/max:2.4748435020446777 - critic/returns/min:-2.4748222827911377 - response_length/mean:1031.5992431640625 - response_length/max:8192.0 - response_length/min:162.0 - response_length/clip_ratio:0.0036764706019312143 - response_length_non_aborted/mean:1031.5992431640625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:162.0 - response_length_non_aborted/clip_ratio:0.0036764706019312143 - response/aborted_ratio:0.0 - prompt_length/mean:232.40196228027344 - prompt_length/max:728.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.622836321592331e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2602232499048114) - timing_s/agent_loop/generate_sequences/max:np.float64(27.359698234125972) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.056344987408011) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.359698234125972) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:198 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.126399967819452 - timing_s/reward:0.0001338701695203781 - timing_s/old_log_prob:8.708438493311405 - timing_s/ref:18.697598289698362 - timing_s/adv:0.0711506912484765 - timing_s/update_actor:20.174732184037566 - timing_s/update_weights:26.00615218374878 - timing_s/step:103.19546892400831 - timing_s/stop_profile:6.116926670074463e-05 - timing_per_token_ms/adv:6.898290350580653e-05 - timing_per_token_ms/update_actor:0.01956005738084453 - timing_per_token_ms/gen:0.03460075906296673 - timing_per_token_ms/ref:0.018127928147658203 - perf/total_num_tokens:1385023 - perf/time_per_step:103.19546892400831 - perf/throughput:3355.338694715151 - frontier/active_count:51.0 - frontier/completed_count:13.0 - frontier/blacklisted_count:1043.0 - frontier/mean_score:2.201007774440588 - frontier/mean_frontier_pct:0.38519919318157286 - frontier/batch_easy_count:4.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:8.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:80.0 - frontier/cluster_0/score:3.642341875099999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.4319299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.9429999999999998 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.7398999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.203645699999999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:1.7825509999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:1.7963542999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:48.0 - frontier/cluster_10/score:3.7798999999999996 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:96.0 - frontier/cluster_11/score:2.5014257623699994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.0231963 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.773585699999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:2.8601555869999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.6659625899999995 
- frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.237 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:1.5340999999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.1028037 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.6374489999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.9176456999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:1.3625509999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.9250575869999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:1.6918456999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.343 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.6601 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.2600999999999996 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.1415080099999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:2.341586392999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:1.60613 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.37387 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.1784900099999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.575709 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:4.038489778999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.7538463929999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:1.9763542999999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.623645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:1.2401 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:1.5376489999999998 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:1.9587436999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.9176456999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:2.7815089999999993 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - 
frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:1.343 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:1.7938699999999996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.676550999999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.1821542999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:1.6737856999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:2.7815089999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.2401 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:63.0 - cluster/prob_snapshot/cluster_0:0.03244807627150137 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.030573606272309716 - cluster/prob_snapshot/cluster_4:0.01730936149254145 - cluster/prob_snapshot/cluster_5:0.015500029882075586 - cluster/prob_snapshot/cluster_6:0.011047524028255613 - 
cluster/prob_snapshot/cluster_7:0.019631343295308565 - cluster/prob_snapshot/cluster_8:0.01587998952027342 - cluster/prob_snapshot/cluster_9:0.016002957255471566 - cluster/prob_snapshot/cluster_10:0.03367352316297346 - cluster/prob_snapshot/cluster_11:0.022284139355439225 - cluster/prob_snapshot/cluster_12:0.018023796256856586 - cluster/prob_snapshot/cluster_13:0.02470869660928647 - cluster/prob_snapshot/cluster_14:0.025479910952287743 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.023749928047299057 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.017015378512997517 - cluster/prob_snapshot/cluster_19:0.019928482583023793 - cluster/prob_snapshot/cluster_20:0.013666645118737952 - cluster/prob_snapshot/cluster_21:0.01873298476127313 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.023495912588338625 - cluster/prob_snapshot/cluster_24:0.025992065943622824 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.012138387962441504 - cluster/prob_snapshot/cluster_28:0.026058095295874426 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.015071934539836249 - cluster/prob_snapshot/cluster_31:0.011964216409924431 - cluster/prob_snapshot/cluster_32:0.014789125586087526 - cluster/prob_snapshot/cluster_33:0.020134270668704546 - cluster/prob_snapshot/cluster_34:0.0279863601527558 - cluster/prob_snapshot/cluster_35:0.020860198323444788 - cluster/prob_snapshot/cluster_36:0.014308329785906126 - cluster/prob_snapshot/cluster_37:0.012239224124425076 - cluster/prob_snapshot/cluster_38:0.019407241940803004 - cluster/prob_snapshot/cluster_39:0.02294589715933734 - cluster/prob_snapshot/cluster_40:0.03597718963903491 - cluster/prob_snapshot/cluster_41:0.024532847509710944 - cluster/prob_snapshot/cluster_42:0.017606500780256672 - cluster/prob_snapshot/cluster_43:0.023372944853140478 - 
cluster/prob_snapshot/cluster_44:0.011047524028255613 - cluster/prob_snapshot/cluster_45:0.013698261651901631 - cluster/prob_snapshot/cluster_46:0.01744961542693678 - cluster/prob_snapshot/cluster_47:0.025992065943622824 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.02477928192267497 - cluster/prob_snapshot/cluster_50:0.011964216409924431 - cluster/prob_snapshot/cluster_51:0.016600666612273288 - cluster/prob_snapshot/cluster_52:0.011964216409924431 - cluster/prob_snapshot/cluster_53:0.011964216409924431 - cluster/prob_snapshot/cluster_54:0.015980825682256988 - cluster/prob_snapshot/cluster_55:0.023844255693372774 - cluster/prob_snapshot/cluster_56:0.01943988554359431 - cluster/prob_snapshot/cluster_57:0.014911045672849477 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.02477928192267497 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.011047524028255613
[36m(TaskRunner pid=2823680)[0m Training Progress:   8%|▊         | 64/800 [1:54:39<21:37:43, 105.79s/it]
[36m(TaskRunner pid=2823680)[0m step:64 - global_seqlen/min:307662 - global_seqlen/max:383875 - global_seqlen/minmax_diff:76213 - global_seqlen/balanced_min:350617 - global_seqlen/balanced_max:350713 - global_seqlen/mean:350658.25 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.3064775131246511) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011639486066997051 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0914269428467378) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003149093572574202) - actor/ppo_kl:np.float64(9.731440210897106e-05) - actor/pg_clipfrac_lower:np.float64(1.0455886637073684e-05) - actor/grad_norm:np.float64(0.2914106547832489) - perf/mfu/actor:np.float64(0.21049599339766845) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(114.44690227508545) - actor/lr:np.float64(1e-06) - training/global_step:64 - training/epoch:0 - critic/score/mean:0.582446813583374 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5740638971328735 - critic/rewards/max:1.0045232772827148 - critic/rewards/min:-0.04901837557554245 - critic/advantages/mean:-0.1593990921974182 - critic/advantages/max:2.47481107711792 - critic/advantages/min:-2.4748451709747314 - critic/returns/mean:-0.1593990921974182 - critic/returns/max:2.47481107711792 - critic/returns/min:-2.4748451709747314 - response_length/mean:1015.9268798828125 - response_length/max:8192.0 - response_length/min:203.0 - response_length/clip_ratio:0.007978723384439945 - response_length_non_aborted/mean:1015.9268798828125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:203.0 - response_length_non_aborted/clip_ratio:0.007978723384439945 - response/aborted_ratio:0.0 - prompt_length/mean:233.79786682128906 - prompt_length/max:411.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.193457663059235e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.48382908385247) - timing_s/agent_loop/generate_sequences/max:np.float64(27.64795591775328) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.99513424496854) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.64795591775328) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:221 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.44250096566975 - timing_s/reward:0.0002590976655483246 - timing_s/old_log_prob:9.91827182751149 - timing_s/ref:18.404753704555333 - timing_s/adv:0.08566184714436531 - timing_s/update_actor:19.467880334705114 - timing_s/update_weights:25.20588419958949 - timing_s/step:103.93126121815294 - timing_s/stop_profile:6.456207484006882e-05 - timing_per_token_ms/adv:9.114969694854645e-05 - timing_per_token_ms/update_actor:0.020715072717827344 - timing_per_token_ms/gen:0.03984740504710188 - timing_per_token_ms/ref:0.01958383782870838 - perf/total_num_tokens:1402633 - perf/time_per_step:103.93126121815294 - perf/throughput:3373.943949972513 - frontier/active_count:50.0 - frontier/completed_count:14.0 - frontier/blacklisted_count:1077.0 - frontier/mean_score:2.2806288049153998 - frontier/mean_frontier_pct:0.40470383623959266 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:9.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:80.0 - frontier/cluster_0/score:3.642341875099999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.4319299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:2.2600999999999996 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.7398999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.203645699999999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:2.1477856999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:1.7963542999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.14593 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:96.0 - frontier/cluster_11/score:2.5014257623699994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.0231963 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.773585699999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:2.8601555869999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.6659625899999995 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.237 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:1.9738699999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.1028037 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.6374489999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:2.9176456999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:1.3625509999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.9475403108999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:1.6918456999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.343 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.6601 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.2600999999999996 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.1415080099999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:2.341586392999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:1.60613 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:1.261709 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.4249430069999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.575709 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:3.726942845299999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:2.827692475099999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:1.9763542999999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.736551989999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:1.7680699999999998 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:1.5376489999999998 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:2.2711205899999998 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.9176456999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:2.7815089999999993 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:1.343 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:1.7938699999999996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:2.773585699999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.1821542999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:1.6737856999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:2.8470562999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:64.0 - cluster/prob_snapshot/cluster_0:0.03194155811107641 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.03009634880172714 - cluster/prob_snapshot/cluster_4:0.019819972414001307 - cluster/prob_snapshot/cluster_5:0.015258072653033437 - 
cluster/prob_snapshot/cluster_6:0.010875070921907449 - cluster/prob_snapshot/cluster_7:0.01932489579409433 - cluster/prob_snapshot/cluster_8:0.018835030894733193 - cluster/prob_snapshot/cluster_9:0.015753149272940414 - cluster/prob_snapshot/cluster_10:0.036357779846192845 - cluster/prob_snapshot/cluster_11:0.02193628140606415 - cluster/prob_snapshot/cluster_12:0.01774244274771449 - cluster/prob_snapshot/cluster_13:0.024322991045470775 - cluster/prob_snapshot/cluster_14:0.025082166644879306 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.02337918896976217 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.016749766519509095 - cluster/prob_snapshot/cluster_19:0.019617396703739187 - cluster/prob_snapshot/cluster_20:0.01730987520411697 - cluster/prob_snapshot/cluster_21:0.018440560738931855 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02312913872100143 - cluster/prob_snapshot/cluster_24:0.025586326838560037 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.011948906346033314 - cluster/prob_snapshot/cluster_28:0.025848487965662952 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.014836659927767235 - cluster/prob_snapshot/cluster_31:0.011777453631256918 - cluster/prob_snapshot/cluster_32:0.014558265653946098 - cluster/prob_snapshot/cluster_33:0.019819972414001307 - cluster/prob_snapshot/cluster_34:0.02754948988830766 - cluster/prob_snapshot/cluster_35:0.02053456825550233 - cluster/prob_snapshot/cluster_36:0.014084975130879132 - cluster/prob_snapshot/cluster_37:0.011064571290870837 - cluster/prob_snapshot/cluster_38:0.02126556502113419 - cluster/prob_snapshot/cluster_39:0.02258770909539175 - cluster/prob_snapshot/cluster_40:0.03268346727242402 - cluster/prob_snapshot/cluster_41:0.024797481019318203 - cluster/prob_snapshot/cluster_42:0.017331661300956976 - 
cluster/prob_snapshot/cluster_43:0.0239982235083759 - cluster/prob_snapshot/cluster_44:0.015505109785418032 - cluster/prob_snapshot/cluster_45:0.01348443022982023 - cluster/prob_snapshot/cluster_46:0.019916617602172636 - cluster/prob_snapshot/cluster_47:0.025586326838560037 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.024392474514090687 - cluster/prob_snapshot/cluster_50:0.011777453631256918 - cluster/prob_snapshot/cluster_51:0.016341528318714053 - cluster/prob_snapshot/cluster_52:0.011777453631256918 - cluster/prob_snapshot/cluster_53:0.01613677768196266 - cluster/prob_snapshot/cluster_54:0.015731363176100403 - cluster/prob_snapshot/cluster_55:0.024322991045470775 - cluster/prob_snapshot/cluster_56:0.019136426719655916 - cluster/prob_snapshot/cluster_57:0.014678282554289573 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0249672922999463 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:   8%|▊         | 65/800 [1:56:22<21:26:25, 105.01s/it]
[36m(TaskRunner pid=2823680)[0m step:65 - global_seqlen/min:339094 - global_seqlen/max:383208 - global_seqlen/minmax_diff:44114 - global_seqlen/balanced_min:351750 - global_seqlen/balanced_max:351949 - global_seqlen/mean:351842.0 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.2544532259150098) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010321598500013351 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.03381987567991018) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00028070752784969955) - actor/ppo_kl:np.float64(2.9616407442626762e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2232219527165095) - perf/mfu/actor:np.float64(0.20464765546202385) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(114.62559509277344) - actor/lr:np.float64(1e-06) - training/global_step:65 - training/epoch:0 - critic/score/mean:0.5828947424888611 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5751876831054688 - critic/rewards/max:1.0023185014724731 - critic/rewards/min:-0.06912975758314133 - critic/advantages/mean:-0.12849178910255432 - critic/advantages/max:2.474849224090576 - critic/advantages/min:-2.4748523235321045 - critic/returns/mean:-0.12849178910255432 - critic/returns/max:2.474849224090576 - critic/returns/min:-2.4748523235321045 - response_length/mean:1011.489501953125 - response_length/max:8192.0 - response_length/min:26.0 - response_length/clip_ratio:0.01184210553765297 - response_length_non_aborted/mean:1011.489501953125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:26.0 - response_length_non_aborted/clip_ratio:0.01184210553765297 - response/aborted_ratio:0.0 - prompt_length/mean:229.46316528320312 - prompt_length/max:369.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.798204362392426e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.6120065245777369) - timing_s/agent_loop/generate_sequences/max:np.float64(27.145405190065503) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.30790843799241) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.145405190065503) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:200 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.08132061548531 - timing_s/reward:0.00013107340782880783 - timing_s/old_log_prob:9.273641963489354 - timing_s/ref:18.613844802603126 - timing_s/adv:0.07502110488712788 - timing_s/update_actor:20.132314398884773 - timing_s/update_weights:25.458018166013062 - timing_s/step:103.00448140315711 - timing_s/stop_profile:5.510076880455017e-05 - timing_per_token_ms/adv:7.954532477927385e-05 - timing_per_token_ms/update_actor:0.02134641298374845 - timing_per_token_ms/gen:0.03783024593159295 - timing_per_token_ms/ref:0.019736370617864804 - perf/total_num_tokens:1407368 - perf/time_per_step:103.00448140315711 - perf/throughput:3415.793130620198 - frontier/active_count:50.0 - frontier/completed_count:14.0 - frontier/blacklisted_count:1110.0 - frontier/mean_score:2.2781044853685795 - frontier/mean_frontier_pct:0.42693866557066573 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:9.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:80.0 - frontier/cluster_0/score:3.449639312569999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.4319299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:2.2600999999999996 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.7398999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.203645699999999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:2.4034499899999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:1.7963542999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.402151 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:2.0509980336589995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.0231963 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.773585699999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:2.9021089108999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.6659625899999995 - frontier/cluster_16/cluster_size:150.0 
- frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.237 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:1.681709 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.1028037 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.6374489999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.54235199 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:1.3625509999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.9475403108999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:1.6918456999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.343 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.6601 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.2600999999999996 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - 
frontier/cluster_34/score:3.0990556069999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:2.341586392999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.424291 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:1.261709 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.4249430069999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.575709 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:3.726942845299999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:2.827692475099999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:1.9763542999999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.736551989999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:1.7680699999999998 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:1.5376489999999998 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:2.4897844129999998 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.9176456999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:2.2470562999999992 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.155709 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:2.841509989999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.1821542999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:1.4716499899999997 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:2.8929394099999994 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:65.0 - cluster/prob_snapshot/cluster_0:0.030285172034257016 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.030129697931258317 - cluster/prob_snapshot/cluster_4:0.019841934507532764 - cluster/prob_snapshot/cluster_5:0.01527497980162659 - cluster/prob_snapshot/cluster_6:0.01088712135869713 - cluster/prob_snapshot/cluster_7:0.01934630930366187 - 
cluster/prob_snapshot/cluster_8:0.021100436836294978 - cluster/prob_snapshot/cluster_9:0.015770605005497484 - cluster/prob_snapshot/cluster_10:0.03864748986074505 - cluster/prob_snapshot/cluster_11:0.01800618054906436 - cluster/prob_snapshot/cluster_12:0.01776210277442707 - cluster/prob_snapshot/cluster_13:0.02434994283900244 - cluster/prob_snapshot/cluster_14:0.02547827748498077 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.023405094956113634 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.016768326582623595 - cluster/prob_snapshot/cluster_19:0.019639134327397376 - cluster/prob_snapshot/cluster_20:0.01476410771148552 - cluster/prob_snapshot/cluster_21:0.018460994335470812 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.023154767631944507 - cluster/prob_snapshot/cluster_24:0.031099117821427533 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.011962146677214847 - cluster/prob_snapshot/cluster_28:0.02587713013016706 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01485310011780493 - cluster/prob_snapshot/cluster_31:0.011790503979300256 - cluster/prob_snapshot/cluster_32:0.014574397361158864 - cluster/prob_snapshot/cluster_33:0.019841934507532764 - cluster/prob_snapshot/cluster_34:0.02720731754758471 - cluster/prob_snapshot/cluster_35:0.02055732217761863 - cluster/prob_snapshot/cluster_36:0.012504176249576724 - cluster/prob_snapshot/cluster_37:0.011076831709023788 - cluster/prob_snapshot/cluster_38:0.021289128945353553 - cluster/prob_snapshot/cluster_39:0.022612738059582638 - cluster/prob_snapshot/cluster_40:0.032719683133383665 - cluster/prob_snapshot/cluster_41:0.024824958585185356 - cluster/prob_snapshot/cluster_42:0.017350866149409658 - cluster/prob_snapshot/cluster_43:0.024024815433847373 - cluster/prob_snapshot/cluster_44:0.015522290670648847 - 
cluster/prob_snapshot/cluster_45:0.013499372042641145 - cluster/prob_snapshot/cluster_46:0.02185838646990041 - cluster/prob_snapshot/cluster_47:0.02561467850784682 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.019727420883739168 - cluster/prob_snapshot/cluster_50:0.011790503979300256 - cluster/prob_snapshot/cluster_51:0.016359636021685883 - cluster/prob_snapshot/cluster_52:0.01088712135869713 - cluster/prob_snapshot/cluster_53:0.016154658505071036 - cluster/prob_snapshot/cluster_54:0.018925462057120906 - cluster/prob_snapshot/cluster_55:0.024946265706862564 - cluster/prob_snapshot/cluster_56:0.019157631390615906 - cluster/prob_snapshot/cluster_57:0.012919951647976305 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.025397776340640006 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:   8%|▊         | 66/800 [1:58:16<21:58:28, 107.78s/it]
[36m(TaskRunner pid=2823680)[0m step:66 - global_seqlen/min:346983 - global_seqlen/max:448402 - global_seqlen/minmax_diff:101419 - global_seqlen/balanced_min:397902 - global_seqlen/balanced_max:397970 - global_seqlen/mean:397929.5 - frontier/skipped_zero_acc_count:39.0 - actor/entropy:np.float64(0.31015583090484145) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010255996137857437 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.018643146380782127) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00037779652274265473) - actor/ppo_kl:np.float64(-9.243347201643498e-06) - actor/pg_clipfrac_lower:np.float64(4.322433192606291e-06) - actor/grad_norm:np.float64(0.24254611258705458) - perf/mfu/actor:np.float64(0.23120246995114024) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(114.69552230834961) - actor/lr:np.float64(1e-06) - training/global_step:66 - training/epoch:0 - critic/score/mean:0.5084269642829895 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5001869201660156 - critic/rewards/max:1.0004494190216064 - critic/rewards/min:-0.0724300891160965 - critic/advantages/mean:-0.12318573147058487 - critic/advantages/max:2.4748213291168213 - critic/advantages/min:-2.4748551845550537 - critic/returns/mean:-0.12318573147058487 - critic/returns/max:2.4748213291168213 - critic/returns/min:-2.4748551845550537 - response_length/mean:1215.8665771484375 - response_length/max:8192.0 - response_length/min:121.0 - response_length/clip_ratio:0.012640449218451977 - response_length_non_aborted/mean:1215.8665771484375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:121.0 - response_length_non_aborted/clip_ratio:0.012640449218451977 - response/aborted_ratio:0.0 - prompt_length/mean:238.9213409423828 - prompt_length/max:375.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.276717901229858e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9670651089400053) - timing_s/agent_loop/generate_sequences/max:np.float64(29.872108802199364) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.025503617688628) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.872108802199364) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:213 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.432194907218218 - timing_s/reward:0.00015348941087722778 - timing_s/old_log_prob:10.597366876900196 - timing_s/ref:22.63281362131238 - timing_s/adv:0.06434565968811512 - timing_s/update_actor:20.314947571605444 - timing_s/update_weights:28.602191804908216 - timing_s/step:114.03326561208814 - timing_s/stop_profile:6.579607725143433e-05 - timing_per_token_ms/adv:6.212116296355324e-05 - timing_per_token_ms/update_actor:0.019612638596117088 - timing_per_token_ms/gen:0.03630854087194275 - timing_per_token_ms/ref:0.021850373593309557 - perf/total_num_tokens:1591718 - perf/time_per_step:114.03326561208814 - perf/throughput:3489.591373745744 - frontier/active_count:49.0 - frontier/completed_count:15.0 - frontier/blacklisted_count:1149.0 - frontier/mean_score:2.2826898129971487 - frontier/mean_frontier_pct:0.4522684659951004 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:10.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.314747518798999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.3023509999999994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:2.4820699999999993 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.7398999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:80.0 - frontier/cluster_7/score:2.4425519899999992 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:2.5824149929999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.5574480099999999 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.402151 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:1.7356986235612997 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.0231963 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.773585699999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:2.9021089108999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - frontier/cluster_16/score:2.766173812999999 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.237 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:1.681709 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.1028037 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.6374489999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.54235199 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.9632782176299997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:1.6918456999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.343 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.46207 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.2600999999999996 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.0990556069999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:2.341586392999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.424291 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:1.261709 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.4249430069999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.575709 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:3.726942845299999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:2.279384732569999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:1.9763542999999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.815586392999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:1.7680699999999998 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:1.3763542999999998 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:2.6428490890999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.9176456999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:2.2470562999999992 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - 
frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.155709 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:2.889056992999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.1821542999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:1.4716499899999997 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:2.8929394099999994 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:66.0 - cluster/prob_snapshot/cluster_0:0.029635173452716 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02952434359833502 - cluster/prob_snapshot/cluster_4:0.0221907021740328 - cluster/prob_snapshot/cluster_5:0.015555404445724607 - cluster/prob_snapshot/cluster_6:0.011086991811680607 - 
cluster/prob_snapshot/cluster_7:0.02183739530097102 - cluster/prob_snapshot/cluster_8:0.023087826692808826 - cluster/prob_snapshot/cluster_9:0.013924210413666845 - cluster/prob_snapshot/cluster_10:0.039356997089574706 - cluster/prob_snapshot/cluster_11:0.015517842453809718 - cluster/prob_snapshot/cluster_12:0.01808818709097855 - cluster/prob_snapshot/cluster_13:0.024796969554789463 - cluster/prob_snapshot/cluster_14:0.02594601865313573 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0247307043096656 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.017076166728739586 - cluster/prob_snapshot/cluster_19:0.019999677995911232 - cluster/prob_snapshot/cluster_20:0.015035153546189485 - cluster/prob_snapshot/cluster_21:0.018799909203670417 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.023579852807616482 - cluster/prob_snapshot/cluster_24:0.03167004879221071 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.026492896810414734 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.015125779713351375 - cluster/prob_snapshot/cluster_31:0.012006959118689667 - cluster/prob_snapshot/cluster_32:0.01307149271680015 - cluster/prob_snapshot/cluster_33:0.020206201268913263 - cluster/prob_snapshot/cluster_34:0.027706801176317932 - cluster/prob_snapshot/cluster_35:0.020934722333306763 - cluster/prob_snapshot/cluster_36:0.012733733291226823 - cluster/prob_snapshot/cluster_37:0.01128018494615251 - cluster/prob_snapshot/cluster_38:0.021679963924200584 - cluster/prob_snapshot/cluster_39:0.02302787242341105 - cluster/prob_snapshot/cluster_40:0.033320365138652296 - cluster/prob_snapshot/cluster_41:0.020378614519533402 - cluster/prob_snapshot/cluster_42:0.01766940080725728 - cluster/prob_snapshot/cluster_43:0.02517247261049135 - cluster/prob_snapshot/cluster_44:0.015807255553969946 - 
cluster/prob_snapshot/cluster_45:0.012305159950061602 - cluster/prob_snapshot/cluster_46:0.02362813177192101 - cluster/prob_snapshot/cluster_47:0.026084923784602154 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.020089585354798253 - cluster/prob_snapshot/cluster_50:0.012006959118689667 - cluster/prob_snapshot/cluster_51:0.01665997325083664 - cluster/prob_snapshot/cluster_52:0.011086991811680607 - cluster/prob_snapshot/cluster_53:0.016451232668876287 - cluster/prob_snapshot/cluster_54:0.019272903823374075 - cluster/prob_snapshot/cluster_55:0.025829329267695822 - cluster/prob_snapshot/cluster_56:0.0195093354212754 - cluster/prob_snapshot/cluster_57:0.013157141673082691 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.025864039634189276 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:   8%|▊         | 67/800 [2:00:03<21:51:45, 107.38s/it]
[36m(TaskRunner pid=2823680)[0m step:67 - global_seqlen/min:355579 - global_seqlen/max:425021 - global_seqlen/minmax_diff:69442 - global_seqlen/balanced_min:380759 - global_seqlen/balanced_max:380933 - global_seqlen/mean:380875.75 - frontier/skipped_zero_acc_count:42.0 - actor/entropy:np.float64(0.2742093585778114) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012189967557787895 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04263105086283758) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004073098785992506) - actor/ppo_kl:np.float64(0.00014272478993231828) - actor/pg_clipfrac_lower:np.float64(1.4666421668872043e-06) - actor/grad_norm:np.float64(0.2207853929563002) - perf/mfu/actor:np.float64(0.23657942372164484) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(114.81023788452148) - actor/lr:np.float64(1e-06) - training/global_step:67 - training/epoch:0 - critic/score/mean:0.5654069781303406 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5563554167747498 - critic/rewards/max:1.0002869367599487 - critic/rewards/min:-0.053540073335170746 - critic/advantages/mean:-0.17312483489513397 - critic/advantages/max:2.474843740463257 - critic/advantages/min:-2.4748451709747314 - critic/returns/mean:-0.17312483489513397 - critic/returns/max:2.474843740463257 - critic/returns/min:-2.4748451709747314 - response_length/mean:1136.12646484375 - response_length/max:8192.0 - response_length/min:154.0 - response_length/clip_ratio:0.011627906933426857 - response_length_non_aborted/mean:1136.12646484375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:154.0 - response_length_non_aborted/clip_ratio:0.011627906933426857 - response/aborted_ratio:0.0 - prompt_length/mean:231.36045837402344 - prompt_length/max:684.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.448213338851929e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2492271782830358) - timing_s/agent_loop/generate_sequences/max:np.float64(28.301756369881332) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.739447652294075) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.301756369881332) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:212 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.866742479614913 - timing_s/reward:0.0011637751013040543 - timing_s/old_log_prob:9.822299853898585 - timing_s/ref:20.151814511977136 - timing_s/adv:0.06883542519062757 - timing_s/update_actor:18.886013643816113 - timing_s/update_weights:27.023459428921342 - timing_s/step:106.21832944545895 - timing_s/stop_profile:5.626212805509567e-05 - timing_per_token_ms/adv:7.316449520756393e-05 - timing_per_token_ms/update_actor:0.02007375782028453 - timing_per_token_ms/gen:0.038209622505600185 - timing_per_token_ms/ref:0.02141916509126202 - perf/total_num_tokens:1523503 - perf/time_per_step:106.21832944545895 - perf/throughput:3585.7817759746663 - frontier/active_count:48.0 - frontier/completed_count:16.0 - frontier/blacklisted_count:1191.0 - frontier/mean_score:2.233895565785879 - frontier/mean_frontier_pct:0.4659384118918124 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:10.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.220323263159299 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.211645699999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:2.4820699999999993 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.5179299999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:2.0097863929999993 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:2.5824149929999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.5574480099999999 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.402151 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.1149890364929096 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.0231963 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.773585699999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:128.0 - frontier/cluster_14/score:2.9314762376299996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - frontier/cluster_16/score:2.766173812999999 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.91 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:2.4659 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.0771962999999998 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.1028037 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.6374489999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.54235199 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.3742947523409996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:1.6918456999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:1.2401 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.46207 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.2600999999999996 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.0990556069999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.1391104750999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.424291 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:1.261709 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.4249430069999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.575709 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:1.8955693127989994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:1.6834480099999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.270910475099999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:1.7680699999999998 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:1.3763542999999998 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:2.6428490890999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.9423519899999997 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:2.2470562999999992 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.155709 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:2.889056992999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.1821542999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:1.4716499899999997 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:2.8929394099999994 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:67.0 - cluster/prob_snapshot/cluster_0:0.030032768321863457 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.029951841277382242 - cluster/prob_snapshot/cluster_4:0.023147810693860828 - cluster/prob_snapshot/cluster_5:0.014156231003369029 - cluster/prob_snapshot/cluster_6:0.01156518552718369 - cluster/prob_snapshot/cluster_7:0.018743288851749296 - 
cluster/prob_snapshot/cluster_8:0.02408362914460589 - cluster/prob_snapshot/cluster_9:0.014524776376576919 - cluster/prob_snapshot/cluster_10:0.04105450611537554 - cluster/prob_snapshot/cluster_11:0.01972440980162888 - cluster/prob_snapshot/cluster_12:0.018868349784220296 - cluster/prob_snapshot/cluster_13:0.025866489150910114 - cluster/prob_snapshot/cluster_14:0.02733897795074701 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.025797365815484325 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.01781267991042726 - cluster/prob_snapshot/cluster_19:0.02299700910529978 - cluster/prob_snapshot/cluster_20:0.0193719543471329 - cluster/prob_snapshot/cluster_21:0.01961076922647231 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.024596876867579297 - cluster/prob_snapshot/cluster_24:0.033036011585306294 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.022142697610710676 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.015778170634519757 - cluster/prob_snapshot/cluster_31:0.01156518552718369 - cluster/prob_snapshot/cluster_32:0.013635280061067217 - cluster/prob_snapshot/cluster_33:0.021077716159977302 - cluster/prob_snapshot/cluster_34:0.0289018248963905 - cluster/prob_snapshot/cluster_35:0.029275377013835358 - cluster/prob_snapshot/cluster_36:0.013282952713247306 - cluster/prob_snapshot/cluster_37:0.01176671128644255 - cluster/prob_snapshot/cluster_38:0.02261504376163349 - cluster/prob_snapshot/cluster_39:0.024021088983982558 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.01767809917116081 - cluster/prob_snapshot/cluster_42:0.015699853690039657 - cluster/prob_snapshot/cluster_43:0.021178534763451613 - cluster/prob_snapshot/cluster_44:0.0164890392509053 - cluster/prob_snapshot/cluster_45:0.012835894549340403 - 
cluster/prob_snapshot/cluster_46:0.024647238154818087 - cluster/prob_snapshot/cluster_47:0.027440405330721816 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.020956070477805757 - cluster/prob_snapshot/cluster_50:0.012524831999844927 - cluster/prob_snapshot/cluster_51:0.017378535566415002 - cluster/prob_snapshot/cluster_52:0.01156518552718369 - cluster/prob_snapshot/cluster_53:0.01716079178176817 - cluster/prob_snapshot/cluster_54:0.020104164605773422 - cluster/prob_snapshot/cluster_55:0.026943375633136375 - cluster/prob_snapshot/cluster_56:0.02035079374924736 - cluster/prob_snapshot/cluster_57:0.013724623147671977 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.026979583094549887 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826756) WARNING:2026-04-12 13:32:27,265:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:   8%|▊         | 68/800 [2:01:50<21:49:08, 107.31s/it]
[36m(TaskRunner pid=2823680)[0m step:68 - global_seqlen/min:332269 - global_seqlen/max:424356 - global_seqlen/minmax_diff:92087 - global_seqlen/balanced_min:358755 - global_seqlen/balanced_max:358938 - global_seqlen/mean:358869.75 - frontier/skipped_zero_acc_count:29.0 - actor/entropy:np.float64(0.26090031132102015) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010136481374502182 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.06262636853352888) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003793397493791417) - actor/ppo_kl:np.float64(1.2977564666698527e-05) - actor/pg_clipfrac_lower:np.float64(6.550354537466774e-06) - actor/grad_norm:np.float64(0.22878233859172234) - perf/mfu/actor:np.float64(0.20578634218007202) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(115.25806045532227) - actor/lr:np.float64(1e-06) - training/global_step:68 - training/epoch:0 - critic/score/mean:0.5252525210380554 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.517605185508728 - critic/rewards/max:1.000760793685913 - critic/rewards/min:-0.03421397507190704 - critic/advantages/mean:-0.1707344353199005 - critic/advantages/max:2.474858283996582 - critic/advantages/min:-2.4748449325561523 - critic/returns/mean:-0.1707344353199005 - critic/returns/max:2.474858283996582 - critic/returns/min:-2.4748449325561523 - response_length/mean:1089.501220703125 - response_length/max:8192.0 - response_length/min:125.0 - response_length/clip_ratio:0.008838383480906487 - response_length_non_aborted/mean:1089.501220703125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:125.0 - response_length_non_aborted/clip_ratio:0.008838383480906487 - response/aborted_ratio:0.0 - prompt_length/mean:233.18182373046875 - prompt_length/max:458.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.097065776586533e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0182006359100342) - timing_s/agent_loop/generate_sequences/max:np.float64(28.24383255187422) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.101678577685561) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.24383255187422) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:203 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.0756374951452 - timing_s/reward:0.00028353743255138397 - timing_s/old_log_prob:10.463471003808081 - timing_s/ref:19.16536669526249 - timing_s/adv:0.08198388759046793 - timing_s/update_actor:20.5004628803581 - timing_s/update_weights:26.208959734998643 - timing_s/step:106.92030026670545 - timing_s/stop_profile:7.148738950490952e-05 - timing_per_token_ms/adv:7.826138482143632e-05 - timing_per_token_ms/update_actor:0.019569633273694806 - timing_per_token_ms/gen:0.03485474599181258 - timing_per_token_ms/ref:0.018295157527468454 - perf/total_num_tokens:1435479 - perf/time_per_step:106.92030026670545 - perf/throughput:3356.422953403832 - frontier/active_count:47.0 - frontier/completed_count:17.0 - frontier/blacklisted_count:1220.0 - frontier/mean_score:2.248676909987334 - frontier/mean_frontier_pct:0.47924000717919235 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:10.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.154226284211509 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.211645699999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.5179299999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:1.7068504750999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:96.0 - frontier/cluster_8/score:2.1076904950999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.9902136069999998 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.402151 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.1149890364929096 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:1.71623741 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.773585699999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:128.0 - frontier/cluster_14/score:2.9314762376299996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - frontier/cluster_16/score:2.766173812999999 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.237 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:2.4659 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.3540374099999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.1028037 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.6374489999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.54235199 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.3742947523409996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:1.6918456999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:1.2401 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.46207 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.2600999999999996 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.0990556069999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.1391104750999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.424291 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:1.261709 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.4249430069999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.575709 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.2268985189592994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:1.6834480099999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.270910475099999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.1376489999999997 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:112.0 - frontier/cluster_45/score:1.2634480099999998 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:2.7499943623699994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.9423519899999997 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:2.4729394099999995 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.155709 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:2.889056992999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.4275080099999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:112.0 - frontier/cluster_57/score:1.3301549929999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:2.9250575869999995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:68.0 - cluster/prob_snapshot/cluster_0:0.029844748811332394 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.030388041488106636 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01436239365258805 - cluster/prob_snapshot/cluster_6:0.011733613782305143 - cluster/prob_snapshot/cluster_7:0.016149926827648927 - 
cluster/prob_snapshot/cluster_8:0.019942606436689707 - cluster/prob_snapshot/cluster_9:0.018831060244195168 - cluster/prob_snapshot/cluster_10:0.04165239871412658 - cluster/prob_snapshot/cluster_11:0.02001166398517658 - cluster/prob_snapshot/cluster_12:0.01623874439777734 - cluster/prob_snapshot/cluster_13:0.026243192803745218 - cluster/prob_snapshot/cluster_14:0.02773712602560712 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.026173062798539117 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.021166110822527704 - cluster/prob_snapshot/cluster_19:0.023331923414068423 - cluster/prob_snapshot/cluster_20:0.022273498748518587 - cluster/prob_snapshot/cluster_21:0.01989636841851645 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.024955090667306595 - cluster/prob_snapshot/cluster_24:0.03351712775714866 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.02246517025185318 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.016007954215832343 - cluster/prob_snapshot/cluster_31:0.011733613782305143 - cluster/prob_snapshot/cluster_32:0.01383385590089096 - cluster/prob_snapshot/cluster_33:0.021384679065710708 - cluster/prob_snapshot/cluster_34:0.02932273331378536 - cluster/prob_snapshot/cluster_35:0.029701725614718004 - cluster/prob_snapshot/cluster_36:0.013476397474085295 - cluster/prob_snapshot/cluster_37:0.011938074438882703 - cluster/prob_snapshot/cluster_38:0.02294439536185765 - cluster/prob_snapshot/cluster_39:0.024370917362799286 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.021070532339211148 - cluster/prob_snapshot/cluster_42:0.015928496711499204 - cluster/prob_snapshot/cluster_43:0.021486965929372204 - cluster/prob_snapshot/cluster_44:0.02022606867843787 - cluster/prob_snapshot/cluster_45:0.011954528653626324 - 
cluster/prob_snapshot/cluster_46:0.02601997560806876 - cluster/prob_snapshot/cluster_47:0.027840030531615968 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.023398529105702396 - cluster/prob_snapshot/cluster_50:0.012707235956483999 - cluster/prob_snapshot/cluster_51:0.01763162587915901 - cluster/prob_snapshot/cluster_52:0.011733613782305143 - cluster/prob_snapshot/cluster_53:0.01741071100783783 - cluster/prob_snapshot/cluster_54:0.020396949304926405 - cluster/prob_snapshot/cluster_55:0.027335762398943506 - cluster/prob_snapshot/cluster_56:0.02296866498088229 - cluster/prob_snapshot/cluster_57:0.01258569870048125 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.027676393852801726 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826756) WARNING:2026-04-12 13:34:16,652:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:   9%|▊         | 69/800 [2:03:42<22:03:48, 108.66s/it]
[36m(TaskRunner pid=2823680)[0m step:69 - global_seqlen/min:329574 - global_seqlen/max:475486 - global_seqlen/minmax_diff:145912 - global_seqlen/balanced_min:386473 - global_seqlen/balanced_max:386526 - global_seqlen/mean:386496.0 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.3001989643184506) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011371418833732605 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.041978867157013156) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002999964207428631) - actor/ppo_kl:np.float64(4.5828896645616624e-05) - actor/pg_clipfrac_lower:np.float64(3.8717820532429885e-07) - actor/grad_norm:np.float64(0.24027815346534437) - perf/mfu/actor:np.float64(0.21654493142296619) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(115.77498626708984) - actor/lr:np.float64(1e-06) - training/global_step:69 - training/epoch:0 - critic/score/mean:0.5528350472450256 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.544256329536438 - critic/rewards/max:1.0029041767120361 - critic/rewards/min:-0.07183472067117691 - critic/advantages/mean:-0.1984744518995285 - critic/advantages/max:2.4748282432556152 - critic/advantages/min:-2.4748482704162598 - critic/returns/mean:-0.1984744518995285 - critic/returns/max:2.4748282432556152 - critic/returns/min:-2.4748482704162598 - response_length/mean:1133.980712890625 - response_length/max:8192.0 - response_length/min:178.0 - response_length/clip_ratio:0.005154639016836882 - response_length_non_aborted/mean:1133.980712890625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:178.0 - response_length_non_aborted/clip_ratio:0.005154639016836882 - response/aborted_ratio:0.0 - prompt_length/mean:233.79380798339844 - prompt_length/max:373.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.388608694076538e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7559268958866596) - timing_s/agent_loop/generate_sequences/max:np.float64(29.44971668254584) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.01015685981838) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.44971668254584) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:195 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.904812975786626 - timing_s/reward:0.00012798607349395752 - timing_s/old_log_prob:10.117745124734938 - timing_s/ref:19.14365276042372 - timing_s/adv:0.09349931869655848 - timing_s/update_actor:21.012981683947146 - timing_s/update_weights:27.886715582571924 - timing_s/step:111.5782964732498 - timing_s/stop_profile:5.7641416788101196e-05 - timing_per_token_ms/adv:8.809113937679867e-05 - timing_per_token_ms/update_actor:0.01979755065649307 - timing_per_token_ms/gen:0.03739315018572998 - timing_per_token_ms/ref:0.018036347291176518 - perf/total_num_tokens:1545984 - perf/time_per_step:111.5782964732498 - perf/throughput:3463.8994519212797 - frontier/active_count:47.0 - frontier/completed_count:17.0 - frontier/blacklisted_count:1250.0 - frontier/mean_score:2.2974602186942112 - frontier/mean_frontier_pct:0.4920112188129227 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:10.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.107958398948056 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.211645699999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.9625509999999995 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:1.7068504750999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:96.0 - frontier/cluster_8/score:2.1076904950999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.9902136069999998 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.402151 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.1149890364929096 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:1.71623741 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.773585699999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:128.0 - frontier/cluster_14/score:2.9314762376299996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - frontier/cluster_16/score:2.236321669099999 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.4659 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:2.4659 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.5478261869999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.1028037 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.6374489999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.54235199 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.5620063266386994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:1.6918456999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:1.2401 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:1.3234489999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.4820699999999993 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.0990556069999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.1391104750999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.424291 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:1.7831962999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:3.1974601048999993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.7029962999999997 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.2268985189592994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:112.0 - frontier/cluster_42/score:1.4784136069999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.270910475099999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.3963542999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:112.0 - frontier/cluster_45/score:1.2634480099999998 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:2.8249960536589995 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.9423519899999997 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:2.6310575869999995 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.155709 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:2.889056992999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.4275080099999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:112.0 - frontier/cluster_57/score:1.3301549929999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:2.9475403108999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:69.0 - cluster/prob_snapshot/cluster_0:0.028782554713085415 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.029742794533730965 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01817502819659349 - cluster/prob_snapshot/cluster_6:0.011484467138227539 - 
cluster/prob_snapshot/cluster_7:0.015807006040766073 - cluster/prob_snapshot/cluster_8:0.019519153478373096 - cluster/prob_snapshot/cluster_9:0.01843120939250447 - cluster/prob_snapshot/cluster_10:0.04076796911298726 - cluster/prob_snapshot/cluster_11:0.019586744687778684 - cluster/prob_snapshot/cluster_12:0.015893937695784004 - cluster/prob_snapshot/cluster_13:0.025685955831552143 - cluster/prob_snapshot/cluster_14:0.02714816750065046 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.020710396515833483 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.022836503117615748 - cluster/prob_snapshot/cluster_19:0.022836503117615748 - cluster/prob_snapshot/cluster_20:0.023595214997594604 - cluster/prob_snapshot/cluster_21:0.019473897258925312 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02442520471675758 - cluster/prob_snapshot/cluster_24:0.03280543909458102 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.023726536139192962 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.015668048015967714 - cluster/prob_snapshot/cluster_31:0.011484467138227539 - cluster/prob_snapshot/cluster_32:0.012256355575856864 - cluster/prob_snapshot/cluster_33:0.02298625219722637 - cluster/prob_snapshot/cluster_34:0.02870010666731013 - cluster/prob_snapshot/cluster_35:0.02907105160434786 - cluster/prob_snapshot/cluster_36:0.013190245290519506 - cluster/prob_snapshot/cluster_37:0.016514038632657796 - cluster/prob_snapshot/cluster_38:0.02961142286954086 - cluster/prob_snapshot/cluster_39:0.025032233031288303 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.020623129474361464 - cluster/prob_snapshot/cluster_42:0.013691470434884235 - cluster/prob_snapshot/cluster_43:0.021030720687962766 - cluster/prob_snapshot/cluster_44:0.022192445939763127 - 
cluster/prob_snapshot/cluster_45:0.011700691195632591 - cluster/prob_snapshot/cluster_46:0.026162063014167618 - cluster/prob_snapshot/cluster_47:0.02724888697544827 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.024366014351008578 - cluster/prob_snapshot/cluster_50:0.012437415826658806 - cluster/prob_snapshot/cluster_51:0.01725724331476534 - cluster/prob_snapshot/cluster_52:0.011484467138227539 - cluster/prob_snapshot/cluster_53:0.017041019257360287 - cluster/prob_snapshot/cluster_54:0.0199638490203059 - cluster/prob_snapshot/cluster_55:0.02675532626124906 - cluster/prob_snapshot/cluster_56:0.022480957961962037 - cluster/prob_snapshot/cluster_57:0.01231845924188193 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.027296935601267662 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:   9%|▉         | 70/800 [2:05:29<21:59:31, 108.45s/it]
[36m(TaskRunner pid=2823680)[0m step:70 - global_seqlen/min:357326 - global_seqlen/max:404914 - global_seqlen/minmax_diff:47588 - global_seqlen/balanced_min:378320 - global_seqlen/balanced_max:378448 - global_seqlen/mean:378358.75 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.2606248596372704) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009412276558578014 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.06810796046920586) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003007661668637714) - actor/ppo_kl:np.float64(2.2649208747319943e-05) - actor/pg_clipfrac_lower:np.float64(1.021041650043723e-06) - actor/grad_norm:np.float64(0.2489968240261078) - perf/mfu/actor:np.float64(0.21217096831206658) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(115.42936706542969) - actor/lr:np.float64(1e-06) - training/global_step:70 - training/epoch:0 - critic/score/mean:0.5328947305679321 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.524553656578064 - critic/rewards/max:1.0037232637405396 - critic/rewards/min:-0.03809118643403053 - critic/advantages/mean:-0.1272253841161728 - critic/advantages/max:2.474850654602051 - critic/advantages/min:-2.474851369857788 - critic/returns/mean:-0.1272253841161728 - critic/returns/max:2.474850654602051 - critic/returns/min:-2.474851369857788 - response_length/mean:1115.9105224609375 - response_length/max:8192.0 - response_length/min:182.0 - response_length/clip_ratio:0.003947368357330561 - response_length_non_aborted/mean:1115.9105224609375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:182.0 - response_length_non_aborted/clip_ratio:0.003947368357330561 - response/aborted_ratio:0.0 - prompt_length/mean:233.05262756347656 - prompt_length/max:728.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.723551243543625e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4240508042275906) - timing_s/agent_loop/generate_sequences/max:np.float64(27.615913948975503) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.848736278980141) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.615913948975503) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:232 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.609925695694983 - timing_s/reward:0.00024976395070552826 - timing_s/old_log_prob:8.566357675008476 - timing_s/ref:20.73612130433321 - timing_s/adv:0.07742306310683489 - timing_s/update_actor:20.84400178771466 - timing_s/update_weights:27.43171187955886 - timing_s/step:107.67261664476246 - timing_s/stop_profile:6.401259452104568e-05 - timing_per_token_ms/adv:7.551907615872121e-05 - timing_per_token_ms/update_actor:0.020331406370306494 - timing_per_token_ms/gen:0.034913577413411494 - timing_per_token_ms/ref:0.020226178882351366 - perf/total_num_tokens:1513435 - perf/time_per_step:107.67261664476246 - perf/throughput:3513.9737640842836 - frontier/active_count:46.0 - frontier/completed_count:18.0 - frontier/blacklisted_count:1283.0 - frontier/mean_score:2.3418732202819332 - frontier/mean_frontier_pct:0.518095486581634 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:11.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.107958398948056 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.211645699999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.9625509999999995 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:1.7068504750999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:1.7753833465699995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.9902136069999998 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:96.0 - frontier/cluster_10/score:4.581505699999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.1149890364929096 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:128.0 - frontier/cluster_12/score:1.501366187 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.773585699999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:128.0 - frontier/cluster_14/score:2.9520333663409994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - frontier/cluster_16/score:2.236321669099999 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.4659 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:2.4659 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.5478261869999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.1028037 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.746214299999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.54235199 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.5620063266386994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:1.6918456999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:1.2401 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:1.3234489999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.6374489999999993 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.0990556069999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.1391104750999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:1.2970036999999999 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:1.7831962999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:3.1382220734299993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.7920974099999993 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.2268985189592994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:112.0 - frontier/cluster_42/score:1.9348895248999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.270910475099999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.3963542999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:128.0 - frontier/cluster_45/score:1.1844136069999998 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:2.8249960536589995 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.9423519899999997 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:2.6310575869999995 - frontier/cluster_49/cluster_size:219.0 
- frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.1880699999999997 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:1.8089962999999998 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:2.889056992999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:3.1992556069999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:112.0 - frontier/cluster_57/score:1.3301549929999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9632782176299997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:70.0 - cluster/prob_snapshot/cluster_0:0.028850542563732796 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02981305058614718 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.018217959795781252 - cluster/prob_snapshot/cluster_6:0.0 - 
cluster/prob_snapshot/cluster_7:0.015844344087252726 - cluster/prob_snapshot/cluster_8:0.01648052072527635 - cluster/prob_snapshot/cluster_9:0.018474746122441044 - cluster/prob_snapshot/cluster_10:0.04252918097249073 - cluster/prob_snapshot/cluster_11:0.019633010930847638 - cluster/prob_snapshot/cluster_12:0.013936875440949764 - cluster/prob_snapshot/cluster_13:0.0257466291437796 - cluster/prob_snapshot/cluster_14:0.027403122356466216 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.02075931695945645 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.022890445680350216 - cluster/prob_snapshot/cluster_19:0.022890445680350216 - cluster/prob_snapshot/cluster_20:0.023650949728901135 - cluster/prob_snapshot/cluster_21:0.01951989694281579 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.025492546032179316 - cluster/prob_snapshot/cluster_24:0.03288292948123423 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.02378258106680593 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01570505782691272 - cluster/prob_snapshot/cluster_31:0.011511594828745003 - cluster/prob_snapshot/cluster_32:0.012285306559557894 - cluster/prob_snapshot/cluster_33:0.024482899983451878 - cluster/prob_snapshot/cluster_34:0.028767899765772438 - cluster/prob_snapshot/cluster_35:0.029139720919297174 - cluster/prob_snapshot/cluster_36:0.012039820244966644 - cluster/prob_snapshot/cluster_37:0.01655304677503203 - cluster/prob_snapshot/cluster_38:0.02913147406818015 - cluster/prob_snapshot/cluster_39:0.025918469563993476 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.020671843782752823 - cluster/prob_snapshot/cluster_42:0.01796118397631781 - cluster/prob_snapshot/cluster_43:0.021080397775747124 - cluster/prob_snapshot/cluster_44:0.022244867162100515 - 
cluster/prob_snapshot/cluster_45:0.010994669424591898 - cluster/prob_snapshot/cluster_46:0.026223860948734766 - cluster/prob_snapshot/cluster_47:0.02731325211872556 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.024423569802950972 - cluster/prob_snapshot/cluster_50:0.01246679449641524 - cluster/prob_snapshot/cluster_51:0.017298006995848048 - cluster/prob_snapshot/cluster_52:0.011511594828745003 - cluster/prob_snapshot/cluster_53:0.020311406577640575 - cluster/prob_snapshot/cluster_54:0.016792542901619904 - cluster/prob_snapshot/cluster_55:0.02681852555484911 - cluster/prob_snapshot/cluster_56:0.029698035885311388 - cluster/prob_snapshot/cluster_57:0.012347556921899963 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.027507506012581416 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:   9%|▉         | 71/800 [2:07:18<21:56:26, 108.35s/it]
[36m(TaskRunner pid=2823680)[0m step:71 - global_seqlen/min:386715 - global_seqlen/max:417842 - global_seqlen/minmax_diff:31127 - global_seqlen/balanced_min:397485 - global_seqlen/balanced_max:397593 - global_seqlen/mean:397541.5 - frontier/skipped_zero_acc_count:39.0 - actor/entropy:np.float64(0.28400147350298033) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009102250449359417 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.07115585063729668) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002873958889823472) - actor/ppo_kl:np.float64(7.664656289163575e-06) - actor/pg_clipfrac_lower:np.float64(1.2311836649637875e-06) - actor/grad_norm:np.float64(0.20715781301259995) - perf/mfu/actor:np.float64(0.23132601138396325) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(115.00920867919922) - actor/lr:np.float64(1e-06) - training/global_step:71 - training/epoch:0 - critic/score/mean:0.5280898809432983 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5200673937797546 - critic/rewards/max:1.0021333694458008 - critic/rewards/min:-0.06613016128540039 - critic/advantages/mean:-0.07055370509624481 - critic/advantages/max:2.474846363067627 - critic/advantages/min:-2.474816083908081 - critic/returns/mean:-0.07055370509624481 - critic/returns/max:2.474846363067627 - critic/returns/min:-2.474816083908081 - response_length/mean:1142.846923828125 - response_length/max:8192.0 - response_length/min:160.0 - response_length/clip_ratio:0.004213483072817326 - response_length_non_aborted/mean:1142.846923828125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:160.0 - response_length_non_aborted/clip_ratio:0.004213483072817326 - response/aborted_ratio:0.0 - prompt_length/mean:250.80899047851562 - prompt_length/max:1168.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.813664317131042e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2166177481412888) - timing_s/agent_loop/generate_sequences/max:np.float64(28.344745594076812) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.1889693080302095) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.344745594076812) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:214 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.228207145817578 - timing_s/reward:0.0001784069463610649 - timing_s/old_log_prob:8.846795839257538 - timing_s/ref:20.287037544883788 - timing_s/adv:0.07324162125587463 - timing_s/update_actor:20.149194388650358 - timing_s/update_weights:27.854371681809425 - timing_s/step:107.89390648435801 - timing_s/stop_profile:7.594842463731766e-05 - timing_per_token_ms/adv:7.381122245959533e-05 - timing_per_token_ms/update_actor:0.02030589498021266 - timing_per_token_ms/gen:0.037148761342617895 - timing_per_token_ms/ref:0.020444810144770984 - perf/total_num_tokens:1590166 - perf/time_per_step:107.89390648435801 - perf/throughput:3684.5593319733384 - frontier/active_count:43.0 - frontier/completed_count:21.0 - frontier/blacklisted_count:1321.0 - frontier/mean_score:2.3548012899156 - frontier/mean_frontier_pct:0.5172812717101686 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:13.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.211645699999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.9625509999999995 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:1.7068504750999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:1.7753833465699995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.9902136069999998 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:96.0 - frontier/cluster_10/score:4.581505699999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.1149890364929096 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:128.0 - frontier/cluster_12/score:1.501366187 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.841509989999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:2.9664233564386993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - frontier/cluster_16/score:2.236321669099999 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.4659 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:2.62613 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.5478261869999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:1.77196259 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.746214299999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.54235199 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.5620063266386994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:1.6918456999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:1.2401 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:1.3234489999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.6374489999999993 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.0990556069999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.0973773325699994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:1.7831962999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:3.1382220734299993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.8544681869999993 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:128.0 - frontier/cluster_41/score:1.8588289632715096 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:112.0 - frontier/cluster_42/score:1.9348895248999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:1.8896373325699993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.5774480099999995 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:2.8249960536589995 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.9596463929999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:2.6310575869999995 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:1.2401 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.1880699999999997 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:1.8089962999999998 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:2.889056992999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:3.1394789248999992 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:1.2311084950999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9632782176299997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:71.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.03171793526849898 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.019381984002509352 - cluster/prob_snapshot/cluster_6:0.0 - 
cluster/prob_snapshot/cluster_7:0.016856707725334874 - cluster/prob_snapshot/cluster_8:0.017533532438923246 - cluster/prob_snapshot/cluster_9:0.01965517751765454 - cluster/prob_snapshot/cluster_10:0.045246554196454215 - cluster/prob_snapshot/cluster_11:0.020887448871794032 - cluster/prob_snapshot/cluster_12:0.01482736266132318 - cluster/prob_snapshot/cluster_13:0.028062506996837537 - cluster/prob_snapshot/cluster_14:0.02929614060432823 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.022085719461538172 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.024353015209178167 - cluster/prob_snapshot/cluster_19:0.02593543283640012 - cluster/prob_snapshot/cluster_20:0.025162111149013913 - cluster/prob_snapshot/cluster_21:0.017499749342781434 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02712137500124196 - cluster/prob_snapshot/cluster_24:0.034983961997133925 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.025302152982133473 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.016708521863693853 - cluster/prob_snapshot/cluster_31:0.012247120386431665 - cluster/prob_snapshot/cluster_32:0.013070267904445286 - cluster/prob_snapshot/cluster_33:0.02604721830180937 - cluster/prob_snapshot/cluster_34:0.0306060052440731 - cluster/prob_snapshot/cluster_35:0.030589430750898615 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.017610692491524485 - cluster/prob_snapshot/cluster_38:0.030992809880376095 - cluster/prob_snapshot/cluster_39:0.02819048103010106 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.018357634135127927 - cluster/prob_snapshot/cluster_42:0.019108801665910707 - cluster/prob_snapshot/cluster_43:0.018661894926764285 - cluster/prob_snapshot/cluster_44:0.025454653711989936 - cluster/prob_snapshot/cluster_45:0.0 
- cluster/prob_snapshot/cluster_46:0.027899416789255817 - cluster/prob_snapshot/cluster_47:0.029229211899313958 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.025984097259594708 - cluster/prob_snapshot/cluster_50:0.012247120386431665 - cluster/prob_snapshot/cluster_51:0.01840325144127612 - cluster/prob_snapshot/cluster_52:0.012247120386431665 - cluster/prob_snapshot/cluster_53:0.021609190149132755 - cluster/prob_snapshot/cluster_54:0.017865491060970445 - cluster/prob_snapshot/cluster_55:0.028532077087761676 - cluster/prob_snapshot/cluster_56:0.03100522243683199 - cluster/prob_snapshot/cluster_57:0.01215832106140506 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.029265079485368325 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:   9%|▉         | 72/800 [2:09:11<22:14:36, 110.00s/it]
[36m(TaskRunner pid=2823680)[0m step:72 - global_seqlen/min:354914 - global_seqlen/max:421512 - global_seqlen/minmax_diff:66598 - global_seqlen/balanced_min:397124 - global_seqlen/balanced_max:397309 - global_seqlen/mean:397198.0 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.2744856054449211) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009067818522453308 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0676195919577367) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00020698090682581886) - actor/ppo_kl:np.float64(-3.4782287685956085e-05) - actor/pg_clipfrac_lower:np.float64(6.516644450881437e-06) - actor/grad_norm:np.float64(0.21638900662461916) - perf/mfu/actor:np.float64(0.22526761746820198) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(115.12422180175781) - actor/lr:np.float64(1e-06) - training/global_step:72 - training/epoch:0 - critic/score/mean:0.4958791136741638 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4877162277698517 - critic/rewards/max:1.0030956268310547 - critic/rewards/min:-0.056321751326322556 - critic/advantages/mean:-0.21052509546279907 - critic/advantages/max:2.4748430252075195 - critic/advantages/min:-2.4748570919036865 - critic/returns/mean:-0.21052509546279907 - critic/returns/max:2.4748430252075195 - critic/returns/min:-2.4748570919036865 - response_length/mean:1221.17578125 - response_length/max:8192.0 - response_length/min:196.0 - response_length/clip_ratio:0.010989011265337467 - response_length_non_aborted/mean:1221.17578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:196.0 - response_length_non_aborted/clip_ratio:0.010989011265337467 - response/aborted_ratio:0.0 - prompt_length/mean:242.2087860107422 - prompt_length/max:728.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.776970207691193e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4873569589108229) - timing_s/agent_loop/generate_sequences/max:np.float64(29.000929209403694) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.171633494603157) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.000929209403694) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:254 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.873724673874676 - timing_s/reward:0.00017616432160139084 - timing_s/old_log_prob:9.381167543120682 - timing_s/ref:21.738709080033004 - timing_s/adv:0.0743280965834856 - timing_s/update_actor:20.711603133939207 - timing_s/update_weights:29.868274562060833 - timing_s/step:113.05372241511941 - timing_s/stop_profile:6.654392927885056e-05 - timing_per_token_ms/adv:6.976910423627073e-05 - timing_per_token_ms/update_actor:0.019441235069554252 - timing_per_token_ms/gen:0.034727974157804445 - timing_per_token_ms/ref:0.020405342387090936 - perf/total_num_tokens:1588792 - perf/time_per_step:113.05372241511941 - perf/throughput:3513.356230248993 - frontier/active_count:42.0 - frontier/completed_count:22.0 - frontier/blacklisted_count:1358.0 - frontier/mean_score:2.3762930820800023 - frontier/mean_frontier_pct:0.5406292498794055 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:14.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.211645699999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:1.6737856999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:1.7068504750999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.5427683425989995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.9902136069999998 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:112.0 - frontier/cluster_10/score:4.7070539899999995 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.1149890364929096 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:128.0 - frontier/cluster_12/score:1.501366187 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.841509989999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:2.9664233564386993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - frontier/cluster_16/score:2.4654251683699995 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.4659 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:2.7382909999999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.5478261869999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:1.77196259 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.222350009999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.54235199 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.5620063266386994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.0842919899999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:1.2401 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:1.3234489999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.6374489999999993 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:112.0 - frontier/cluster_34/score:2.4693389248999993 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:3.0681641327989992 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:1.7831962999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:3.1382220734299993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.8981277308999993 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:128.0 - frontier/cluster_41/score:1.8588289632715096 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:128.0 - frontier/cluster_42/score:2.2544226674299996 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:2.2227461327989992 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.1042136069999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:2.8774972375612995 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.9596463929999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:2.6310575869999995 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.1880699999999997 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:2.1662974099999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:2.889056992999999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:3.1394789248999992 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:1.2311084950999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9632782176299997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:72.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.03217942910264733 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01677067562781752 - cluster/prob_snapshot/cluster_6:0.0 - 
cluster/prob_snapshot/cluster_7:0.017101971693920148 - cluster/prob_snapshot/cluster_8:0.015457933140779895 - cluster/prob_snapshot/cluster_9:0.019941159034317056 - cluster/prob_snapshot/cluster_10:0.04716283310875115 - cluster/prob_snapshot/cluster_11:0.02119135985413957 - cluster/prob_snapshot/cluster_12:0.015043099795123248 - cluster/prob_snapshot/cluster_13:0.028470814594420895 - cluster/prob_snapshot/cluster_14:0.029722397488289656 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.024702592323133515 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.024707349949659164 - cluster/prob_snapshot/cluster_19:0.027436600835801177 - cluster/prob_snapshot/cluster_20:0.02552821818123798 - cluster/prob_snapshot/cluster_21:0.01775436952383893 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.022267074661461744 - cluster/prob_snapshot/cluster_24:0.035492976301472696 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.025670297613651458 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.020883787499169262 - cluster/prob_snapshot/cluster_31:0.012425315167919351 - cluster/prob_snapshot/cluster_32:0.013260439427197561 - cluster/prob_snapshot/cluster_33:0.026426203583834944 - cluster/prob_snapshot/cluster_34:0.024741806627121703 - cluster/prob_snapshot/cluster_35:0.03074180012654909 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.017866926887966827 - cluster/prob_snapshot/cluster_38:0.031443753188685655 - cluster/prob_snapshot/cluster_39:0.029038102131537336 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.01862473648246536 - cluster/prob_snapshot/cluster_42:0.022588430097991434 - cluster/prob_snapshot/cluster_43:0.022271043656399876 - cluster/prob_snapshot/cluster_44:0.021083394280783308 - 
cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.028831392687296547 - cluster/prob_snapshot/cluster_47:0.029654494975099338 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.026362164134682994 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.01867101751736322 - cluster/prob_snapshot/cluster_52:0.012425315167919351 - cluster/prob_snapshot/cluster_53:0.02192360241873179 - cluster/prob_snapshot/cluster_54:0.0217054496143032 - cluster/prob_snapshot/cluster_55:0.028947216898722976 - cluster/prob_snapshot/cluster_56:0.031456346347006774 - cluster/prob_snapshot/cluster_57:0.012335223818660103 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.029690884432128903 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:   9%|▉         | 73/800 [2:11:04<22:22:24, 110.79s/it]
[36m(TaskRunner pid=2823680)[0m step:73 - global_seqlen/min:350063 - global_seqlen/max:424058 - global_seqlen/minmax_diff:73995 - global_seqlen/balanced_min:395870 - global_seqlen/balanced_max:395957 - global_seqlen/mean:395922.75 - frontier/skipped_zero_acc_count:43.0 - actor/entropy:np.float64(0.2736912819361964) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010280769318342209 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.036944754916476086) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00030572240474040967) - actor/ppo_kl:np.float64(2.1278526674987112e-05) - actor/pg_clipfrac_lower:np.float64(1.5292597772502125e-06) - actor/grad_norm:np.float64(0.22673798691142688) - perf/mfu/actor:np.float64(0.24443830298467978) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(107.43047714233398) - actor/lr:np.float64(1e-06) - training/global_step:73 - training/epoch:0 - critic/score/mean:0.5455882549285889 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5375506281852722 - critic/rewards/max:1.0026181936264038 - critic/rewards/min:-0.04480978846549988 - critic/advantages/mean:-0.16694945096969604 - critic/advantages/max:2.4748525619506836 - critic/advantages/min:-2.4748497009277344 - critic/returns/mean:-0.16694945096969604 - critic/returns/max:2.4748525619506836 - critic/returns/min:-2.4748497009277344 - response_length/mean:1161.5264892578125 - response_length/max:8192.0 - response_length/min:160.0 - response_length/clip_ratio:0.014705882407724857 - response_length_non_aborted/mean:1161.5264892578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:160.0 - response_length_non_aborted/clip_ratio:0.014705882407724857 - response/aborted_ratio:0.0 - prompt_length/mean:229.16470336914062 - prompt_length/max:399.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.920114487409592e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1907328013330698) - timing_s/agent_loop/generate_sequences/max:np.float64(30.161608466878533) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.048711609834754) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.161608466878533) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:200 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.654034407809377 - timing_s/reward:0.00014877784997224808 - timing_s/old_log_prob:9.886720173060894 - timing_s/ref:20.324311410076916 - timing_s/adv:0.070868038572371 - timing_s/update_actor:19.161650756374 - timing_s/update_weights:30.928167366422713 - timing_s/step:112.42539807781577 - timing_s/stop_profile:5.30146062374115e-05 - timing_per_token_ms/adv:7.493950169971661e-05 - timing_per_token_ms/update_actor:0.020262513092700416 - timing_per_token_ms/gen:0.040076616227390144 - timing_per_token_ms/ref:0.021491970148230267 - perf/total_num_tokens:1583691 - perf/time_per_step:112.42539807781577 - perf/throughput:3521.6486378456957 - frontier/active_count:42.0 - frontier/completed_count:22.0 - frontier/blacklisted_count:1401.0 - frontier/mean_score:2.3054868775930975 - frontier/mean_frontier_pct:0.5701213957890696 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:8.0 - frontier/force_completed_count:14.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.211645699999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:1.6737856999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:1.7068504750999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.5427683425989995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.9902136069999998 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:128.0 - frontier/cluster_10/score:4.794937792999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.1149890364929096 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:128.0 - frontier/cluster_12/score:1.501366187 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:2.376496349507089 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.0257976178589994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.4659 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:2.2168036999999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.5478261869999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.140373813 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.222350009999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.54235199 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.5620063266386994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.0842919899999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:1.2401 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:1.3234489999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.146214299999999 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:112.0 - frontier/cluster_34/score:2.4693389248999993 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:3.0477148929592994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:1.7831962999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:128.0 - frontier/cluster_38/score:3.096755451400999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:2.9286894116299993 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:128.0 - frontier/cluster_41/score:1.8588289632715096 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:144.0 - frontier/cluster_42/score:1.8780958672009997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:128.0 - frontier/cluster_43/score:1.8559222929592993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.1042136069999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:2.8774972375612995 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:2.371752475099999 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:2.6310575869999995 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:1.8316489999999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:80.0 - frontier/cluster_54/score:2.416408186999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:2.922339895099999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:3.1394789248999992 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:1.2311084950999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9632782176299997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:73.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.033167725006414364 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.017285737283308917 - cluster/prob_snapshot/cluster_6:0.0 - 
cluster/prob_snapshot/cluster_7:0.017627208127342472 - cluster/prob_snapshot/cluster_8:0.015932677796908074 - cluster/prob_snapshot/cluster_9:0.020553592702021905 - cluster/prob_snapshot/cluster_10:0.0495189049467964 - cluster/prob_snapshot/cluster_11:0.021842189739041923 - cluster/prob_snapshot/cluster_12:0.015505104073075336 - cluster/prob_snapshot/cluster_13:0.0298362449730001 - cluster/prob_snapshot/cluster_14:0.024542862059535004 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.020921080525101693 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.025466163061920934 - cluster/prob_snapshot/cluster_19:0.022893663368534666 - cluster/prob_snapshot/cluster_20:0.026312241831207367 - cluster/prob_snapshot/cluster_21:0.022104346703160496 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.022950941942220526 - cluster/prob_snapshot/cluster_24:0.03658303799832114 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.026458684812788104 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.021525171209698553 - cluster/prob_snapshot/cluster_31:0.01280692194050373 - cluster/prob_snapshot/cluster_32:0.013667694569178067 - cluster/prob_snapshot/cluster_33:0.022164663339805536 - cluster/prob_snapshot/cluster_34:0.02550167797422925 - cluster/prob_snapshot/cluster_35:0.031474757463946804 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.018415656655668952 - cluster/prob_snapshot/cluster_38:0.03198121549465526 - cluster/prob_snapshot/cluster_39:0.030245542039130068 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.019196740128510362 - cluster/prob_snapshot/cluster_42:0.019395715803585085 - cluster/prob_snapshot/cluster_43:0.019166721984977375 - cluster/prob_snapshot/cluster_44:0.021730908322711705 - 
cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.029716863563795406 - cluster/prob_snapshot/cluster_47:0.02449387050302573 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.02717179996587299 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.01924444254838885 - cluster/prob_snapshot/cluster_52:0.01280692194050373 - cluster/prob_snapshot/cluster_53:0.018916043678253138 - cluster/prob_snapshot/cluster_54:0.024955044776472163 - cluster/prob_snapshot/cluster_55:0.03017996848654588 - cluster/prob_snapshot/cluster_56:0.0324224349044842 - cluster/prob_snapshot/cluster_57:0.012714063702150403 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.030602752053207345 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(RewardLoopWorker pid=2826754)[0m WARNING:2026-04-12 13:43:29,368:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:   9%|▉         | 74/800 [2:12:55<22:21:17, 110.85s/it]
[36m(TaskRunner pid=2823680)[0m step:74 - global_seqlen/min:370222 - global_seqlen/max:439252 - global_seqlen/minmax_diff:69030 - global_seqlen/balanced_min:412522 - global_seqlen/balanced_max:412768 - global_seqlen/mean:412662.0 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.30591735647370416) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009996124543249607 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.032618924262351356) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00025157802440389787) - actor/ppo_kl:np.float64(1.8716357921870024e-05) - actor/pg_clipfrac_lower:np.float64(1.0443820883665467e-06) - actor/grad_norm:np.float64(0.25481683015823364) - perf/mfu/actor:np.float64(0.23687341210510418) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.14236068725586) - actor/lr:np.float64(1e-06) - training/global_step:74 - training/epoch:0 - critic/score/mean:0.5105262994766235 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5023016929626465 - critic/rewards/max:1.0009053945541382 - critic/rewards/min:-0.03667470067739487 - critic/advantages/mean:-0.16381993889808655 - critic/advantages/max:2.474853038787842 - critic/advantages/min:-2.474844217300415 - critic/returns/mean:-0.16381993889808655 - critic/returns/max:2.474853038787842 - critic/returns/min:-2.474844217300415 - response_length/mean:1177.60400390625 - response_length/max:8192.0 - response_length/min:129.0 - response_length/clip_ratio:0.01315789483487606 - response_length_non_aborted/mean:1177.60400390625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:129.0 - response_length_non_aborted/clip_ratio:0.01315789483487606 - response/aborted_ratio:0.0 - prompt_length/mean:242.85263061523438 - prompt_length/max:1168.0 - prompt_length/min:178.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.184889495372772e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.029175833798945) - timing_s/agent_loop/generate_sequences/max:np.float64(29.524476249702275) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.096731368670589) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.524476249702275) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:203 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.87494099047035 - timing_s/reward:0.00016097351908683777 - timing_s/old_log_prob:9.547489621676505 - timing_s/ref:21.271298008970916 - timing_s/adv:0.12528030946850777 - timing_s/update_actor:20.72006985824555 - timing_s/update_weights:27.760891237296164 - timing_s/step:110.80081474594772 - timing_s/stop_profile:5.228538066148758e-05 - timing_per_token_ms/adv:0.00011604896263757647 - timing_per_token_ms/update_actor:0.019193300391965845 - timing_per_token_ms/gen:0.03449795022058657 - timing_per_token_ms/ref:0.019703911000605732 - perf/total_num_tokens:1650648 - perf/time_per_step:110.80081474594772 - perf/throughput:3724.35889524984 - frontier/active_count:39.0 - frontier/completed_count:25.0 - frontier/blacklisted_count:1434.0 - frontier/mean_score:2.339821774211092 - frontier/mean_frontier_pct:0.5675538573121794 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:15.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:2.548151989999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:1.4716499899999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:1.7068504750999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.5427683425989995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.9902136069999998 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:128.0 - frontier/cluster_10/score:4.794937792999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.1149890364929096 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:128.0 - frontier/cluster_12/score:1.501366187 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:2.376496349507089 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.0257976178589994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.4659 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:2.4517625899999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:96.0 - frontier/cluster_20/score:2.6834783308999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.140373813 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.222350009999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.54235199 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.0842919899999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:1.7680699999999998 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:1.3234489999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.4023500099999993 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:3.0477148929592994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:1.7831962999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:128.0 - frontier/cluster_38/score:3.0677288159806992 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:2.950082588140999 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:128.0 - frontier/cluster_41/score:2.2011802742900564 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:160.0 - frontier/cluster_42/score:1.6146671070406997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:128.0 - frontier/cluster_43/score:2.1991456050715095 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.1042136069999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:2.8774972375612995 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:2.371752475099999 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:2.6310575869999995 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - 
frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:1.8316489999999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:80.0 - frontier/cluster_54/score:2.416408186999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:2.945637926569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:128.0 - frontier/cluster_56/score:3.097635247429999 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9632782176299997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:74.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.027924020210834212 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01612713222182401 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.018704585656825118 - 
cluster/prob_snapshot/cluster_8:0.016906485385657737 - cluster/prob_snapshot/cluster_9:0.02180983128315877 - cluster/prob_snapshot/cluster_10:0.05254550763332795 - cluster/prob_snapshot/cluster_11:0.02317718755886332 - cluster/prob_snapshot/cluster_12:0.016452778293515807 - cluster/prob_snapshot/cluster_13:0.031659840613661325 - cluster/prob_snapshot/cluster_14:0.026042925365142643 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.02219978001553766 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0270226586593432 - cluster/prob_snapshot/cluster_19:0.02686773331575376 - cluster/prob_snapshot/cluster_20:0.02940699904929426 - cluster/prob_snapshot/cluster_21:0.02345536759483186 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.024353706858274028 - cluster/prob_snapshot/cluster_24:0.03881899861187198 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.022840792810808697 - cluster/prob_snapshot/cluster_31:0.019375462141946118 - cluster/prob_snapshot/cluster_32:0.014503066052982318 - cluster/prob_snapshot/cluster_33:0.02632624368405033 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.03339849922682828 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.019541224274100228 - cluster/prob_snapshot/cluster_38:0.03361782256120585 - cluster/prob_snapshot/cluster_39:0.03232859191216431 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.024121717506719507 - cluster/prob_snapshot/cluster_42:0.017694390722263677 - cluster/prob_snapshot/cluster_43:0.024099420506931413 - cluster/prob_snapshot/cluster_44:0.023059104606150422 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.03153316259532909 - 
cluster/prob_snapshot/cluster_47:0.025990939437560187 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.028832544339420157 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.020420665681358672 - cluster/prob_snapshot/cluster_52:0.013589682875806603 - cluster/prob_snapshot/cluster_53:0.020072195024424068 - cluster/prob_snapshot/cluster_54:0.026480300749804668 - cluster/prob_snapshot/cluster_55:0.032279884919792584 - cluster/prob_snapshot/cluster_56:0.033945553324324175 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.03247319671831072 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m local_global_step_folder: /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/checkpoints/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633/global_step_75
[36m(WorkerDict pid=2825160)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.
[36m(WorkerDict pid=2825160)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")
[36m(TaskRunner pid=2823680)[0m test_gen_batch meta info: {'eos_token_id': 151643, 'pad_token_id': 151643, 'recompute_log_prob': False, 'do_sample': True, 'validate': True, 'global_steps': 75}
[36m(RewardLoopWorker pid=2826762)[0m WARNING:2026-04-12 13:48:06,901:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(WorkerDict pid=2825159)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.[32m [repeated 3x across cluster][0m
[36m(WorkerDict pid=2825159)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")[32m [repeated 3x across cluster][0m
[36m(TaskRunner pid=2823680)[0m validation generation end
[36m(TaskRunner pid=2823680)[0m Training Progress:   9%|▉         | 75/800 [2:17:54<33:42:03, 167.34s/it]
[36m(TaskRunner pid=2823680)[0m step:75 - global_seqlen/min:335048 - global_seqlen/max:408060 - global_seqlen/minmax_diff:73012 - global_seqlen/balanced_min:371075 - global_seqlen/balanced_max:371192 - global_seqlen/mean:371157.25 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.3088336983928457) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010224010795354843 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.0259022104437463) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00032750057050634496) - actor/ppo_kl:np.float64(4.138315230939327e-05) - actor/pg_clipfrac_lower:np.float64(1.3480616492718884e-06) - actor/grad_norm:np.float64(0.25160715108116466) - perf/mfu/actor:np.float64(0.2214553425956557) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.67436599731445) - actor/lr:np.float64(1e-06) - val-aux/aime2024/reward/mean@16:np.float64(0.09791666666666667) - val-aux/aime2024/reward/std@16:np.float64(0.1322825199862365) - val-aux/aime2024/reward/best@2/mean:np.float64(0.15366666666666665) - val-aux/aime2024/reward/best@2/std:np.float64(0.1391241129761839) - val-aux/aime2024/reward/worst@2/mean:np.float64(0.043500000000000004) - val-aux/aime2024/reward/worst@2/std:np.float64(0.0860760529078785) - val-aux/aime2024/reward/maj@2/mean:np.float64(0.09786666666666664) - val-aux/aime2024/reward/maj@2/std:np.float64(0.13195065397236647) - val-aux/aime2024/reward/best@4/mean:np.float64(0.21300000000000002) - val-aux/aime2024/reward/best@4/std:np.float64(0.12592973877427757) - val-aux/aime2024/reward/worst@4/mean:np.float64(0.013966666666666667) - val-aux/aime2024/reward/worst@4/std:np.float64(0.0373695219020352) - val-aux/aime2024/reward/maj@4/mean:np.float64(0.12026666666666669) - val-aux/aime2024/reward/maj@4/std:np.float64(0.1230282785865095) - val-aux/aime2024/reward/best@8/mean:np.float64(0.2657) - 
val-aux/aime2024/reward/best@8/std:np.float64(0.09592524135242315) - val-aux/aime2024/reward/worst@8/mean:np.float64(0.002) - val-aux/aime2024/reward/worst@8/std:np.float64(0.012481974011055523) - val-aux/aime2024/reward/maj@8/mean:np.float64(0.1413666666666667) - val-aux/aime2024/reward/maj@8/std:np.float64(0.10842574767223165) - val-aux/aime2024/reward/best@16/mean:np.float64(0.3049333333333333) - val-aux/aime2024/reward/best@16/std:np.float64(0.05816365881317465) - val-aux/aime2024/reward/worst@16/mean:np.float64(3.3333333333333335e-05) - val-aux/aime2024/reward/worst@16/std:np.float64(0.0010535653752852738) - val-aux/aime2024/reward/maj@16/mean:np.float64(0.15756666666666666) - val-aux/aime2024/reward/maj@16/std:np.float64(0.08902878617167392) - val-aux/aime2024/score/mean@16:np.float64(0.09791666666666667) - val-aux/aime2024/score/std@16:np.float64(0.1322825199862365) - val-aux/aime2024/score/best@2/mean:np.float64(0.15366666666666665) - val-aux/aime2024/score/best@2/std:np.float64(0.1391241129761839) - val-aux/aime2024/score/worst@2/mean:np.float64(0.043500000000000004) - val-aux/aime2024/score/worst@2/std:np.float64(0.0860760529078785) - val-aux/aime2024/score/maj@2/mean:np.float64(0.09786666666666664) - val-aux/aime2024/score/maj@2/std:np.float64(0.13195065397236647) - val-aux/aime2024/score/best@4/mean:np.float64(0.21300000000000002) - val-aux/aime2024/score/best@4/std:np.float64(0.12592973877427757) - val-aux/aime2024/score/worst@4/mean:np.float64(0.013966666666666667) - val-aux/aime2024/score/worst@4/std:np.float64(0.0373695219020352) - val-aux/aime2024/score/maj@4/mean:np.float64(0.12026666666666669) - val-aux/aime2024/score/maj@4/std:np.float64(0.1230282785865095) - val-aux/aime2024/score/best@8/mean:np.float64(0.2657) - val-aux/aime2024/score/best@8/std:np.float64(0.09592524135242315) - val-aux/aime2024/score/worst@8/mean:np.float64(0.002) - val-aux/aime2024/score/worst@8/std:np.float64(0.012481974011055523) - 
val-aux/aime2024/score/maj@8/mean:np.float64(0.1413666666666667) - val-aux/aime2024/score/maj@8/std:np.float64(0.10842574767223165) - val-aux/aime2024/score/best@16/mean:np.float64(0.3049333333333333) - val-aux/aime2024/score/best@16/std:np.float64(0.05816365881317465) - val-aux/aime2024/score/worst@16/mean:np.float64(3.3333333333333335e-05) - val-aux/aime2024/score/worst@16/std:np.float64(0.0010535653752852738) - val-aux/aime2024/score/maj@16/mean:np.float64(0.15756666666666666) - val-aux/aime2024/score/maj@16/std:np.float64(0.08902878617167392) - val-core/aime2024/acc/mean@16:np.float64(0.09791666666666667) - val-aux/aime2024/acc/std@16:np.float64(0.1322825199862365) - val-aux/aime2024/acc/best@2/mean:np.float64(0.15366666666666665) - val-aux/aime2024/acc/best@2/std:np.float64(0.1391241129761839) - val-aux/aime2024/acc/worst@2/mean:np.float64(0.043500000000000004) - val-aux/aime2024/acc/worst@2/std:np.float64(0.0860760529078785) - val-aux/aime2024/acc/maj@2/mean:np.float64(0.09786666666666664) - val-aux/aime2024/acc/maj@2/std:np.float64(0.13195065397236647) - val-aux/aime2024/acc/best@4/mean:np.float64(0.21300000000000002) - val-aux/aime2024/acc/best@4/std:np.float64(0.12592973877427757) - val-aux/aime2024/acc/worst@4/mean:np.float64(0.013966666666666667) - val-aux/aime2024/acc/worst@4/std:np.float64(0.0373695219020352) - val-aux/aime2024/acc/maj@4/mean:np.float64(0.12026666666666669) - val-aux/aime2024/acc/maj@4/std:np.float64(0.1230282785865095) - val-aux/aime2024/acc/best@8/mean:np.float64(0.2657) - val-aux/aime2024/acc/best@8/std:np.float64(0.09592524135242315) - val-aux/aime2024/acc/worst@8/mean:np.float64(0.002) - val-aux/aime2024/acc/worst@8/std:np.float64(0.012481974011055523) - val-aux/aime2024/acc/maj@8/mean:np.float64(0.1413666666666667) - val-aux/aime2024/acc/maj@8/std:np.float64(0.10842574767223165) - val-core/aime2024/acc/best@16/mean:np.float64(0.3049333333333333) - val-core/aime2024/acc/best@16/std:np.float64(0.05816365881317465) - 
val-aux/aime2024/acc/worst@16/mean:np.float64(3.3333333333333335e-05) - val-aux/aime2024/acc/worst@16/std:np.float64(0.0010535653752852738) - val-core/aime2024/acc/maj@16/mean:np.float64(0.15756666666666666) - val-core/aime2024/acc/maj@16/std:np.float64(0.08902878617167392) - val-aux/aime2025/reward/mean@16:np.float64(0.035416666666666666) - val-aux/aime2025/reward/std@16:np.float64(0.08420695905512335) - val-aux/aime2025/reward/best@2/mean:np.float64(0.0644) - val-aux/aime2025/reward/best@2/std:np.float64(0.10472772474405069) - val-aux/aime2025/reward/worst@2/mean:np.float64(0.0072) - val-aux/aime2025/reward/worst@2/std:np.float64(0.0336075307812174) - val-aux/aime2025/reward/maj@2/mean:np.float64(0.03623333333333334) - val-aux/aime2025/reward/maj@2/std:np.float64(0.08529221733626574) - val-aux/aime2025/reward/best@4/mean:np.float64(0.10673333333333333) - val-aux/aime2025/reward/best@4/std:np.float64(0.1180730919796044) - val-aux/aime2025/reward/worst@4/mean:np.float64(0.00043333333333333337) - val-aux/aime2025/reward/worst@4/std:np.float64(0.00618377661817555) - val-aux/aime2025/reward/maj@4/mean:np.float64(0.039) - val-aux/aime2025/reward/maj@4/std:np.float64(0.08309222316235595) - val-aux/aime2025/reward/best@8/mean:np.float64(0.1584) - val-aux/aime2025/reward/best@8/std:np.float64(0.11563572218257792) - val-aux/aime2025/reward/worst@8/mean:np.float64(0.0) - val-aux/aime2025/reward/worst@8/std:np.float64(0.0) - val-aux/aime2025/reward/maj@8/mean:np.float64(0.04233333333333334) - val-aux/aime2025/reward/maj@8/std:np.float64(0.07719102321735569) - val-aux/aime2025/reward/best@16/mean:np.float64(0.21253333333333332) - val-aux/aime2025/reward/best@16/std:np.float64(0.09063474820636151) - val-aux/aime2025/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2025/reward/worst@16/std:np.float64(0.0) - val-aux/aime2025/reward/maj@16/mean:np.float64(0.04800000000000001) - val-aux/aime2025/reward/maj@16/std:np.float64(0.06787077315598661) - 
val-aux/aime2025/score/mean@16:np.float64(0.035416666666666666) - val-aux/aime2025/score/std@16:np.float64(0.08420695905512335) - val-aux/aime2025/score/best@2/mean:np.float64(0.0644) - val-aux/aime2025/score/best@2/std:np.float64(0.10472772474405069) - val-aux/aime2025/score/worst@2/mean:np.float64(0.0072) - val-aux/aime2025/score/worst@2/std:np.float64(0.0336075307812174) - val-aux/aime2025/score/maj@2/mean:np.float64(0.03623333333333334) - val-aux/aime2025/score/maj@2/std:np.float64(0.08529221733626574) - val-aux/aime2025/score/best@4/mean:np.float64(0.10673333333333333) - val-aux/aime2025/score/best@4/std:np.float64(0.1180730919796044) - val-aux/aime2025/score/worst@4/mean:np.float64(0.00043333333333333337) - val-aux/aime2025/score/worst@4/std:np.float64(0.00618377661817555) - val-aux/aime2025/score/maj@4/mean:np.float64(0.039) - val-aux/aime2025/score/maj@4/std:np.float64(0.08309222316235595) - val-aux/aime2025/score/best@8/mean:np.float64(0.1584) - val-aux/aime2025/score/best@8/std:np.float64(0.11563572218257792) - val-aux/aime2025/score/worst@8/mean:np.float64(0.0) - val-aux/aime2025/score/worst@8/std:np.float64(0.0) - val-aux/aime2025/score/maj@8/mean:np.float64(0.04233333333333334) - val-aux/aime2025/score/maj@8/std:np.float64(0.07719102321735569) - val-aux/aime2025/score/best@16/mean:np.float64(0.21253333333333332) - val-aux/aime2025/score/best@16/std:np.float64(0.09063474820636151) - val-aux/aime2025/score/worst@16/mean:np.float64(0.0) - val-aux/aime2025/score/worst@16/std:np.float64(0.0) - val-aux/aime2025/score/maj@16/mean:np.float64(0.04800000000000001) - val-aux/aime2025/score/maj@16/std:np.float64(0.06787077315598661) - val-core/aime2025/acc/mean@16:np.float64(0.035416666666666666) - val-aux/aime2025/acc/std@16:np.float64(0.08420695905512335) - val-aux/aime2025/acc/best@2/mean:np.float64(0.0644) - val-aux/aime2025/acc/best@2/std:np.float64(0.10472772474405069) - val-aux/aime2025/acc/worst@2/mean:np.float64(0.0072) - 
val-aux/aime2025/acc/worst@2/std:np.float64(0.0336075307812174) - val-aux/aime2025/acc/maj@2/mean:np.float64(0.03623333333333334) - val-aux/aime2025/acc/maj@2/std:np.float64(0.08529221733626574) - val-aux/aime2025/acc/best@4/mean:np.float64(0.10673333333333333) - val-aux/aime2025/acc/best@4/std:np.float64(0.1180730919796044) - val-aux/aime2025/acc/worst@4/mean:np.float64(0.00043333333333333337) - val-aux/aime2025/acc/worst@4/std:np.float64(0.00618377661817555) - val-aux/aime2025/acc/maj@4/mean:np.float64(0.039) - val-aux/aime2025/acc/maj@4/std:np.float64(0.08309222316235595) - val-aux/aime2025/acc/best@8/mean:np.float64(0.1584) - val-aux/aime2025/acc/best@8/std:np.float64(0.11563572218257792) - val-aux/aime2025/acc/worst@8/mean:np.float64(0.0) - val-aux/aime2025/acc/worst@8/std:np.float64(0.0) - val-aux/aime2025/acc/maj@8/mean:np.float64(0.04233333333333334) - val-aux/aime2025/acc/maj@8/std:np.float64(0.07719102321735569) - val-core/aime2025/acc/best@16/mean:np.float64(0.21253333333333332) - val-core/aime2025/acc/best@16/std:np.float64(0.09063474820636151) - val-aux/aime2025/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2025/acc/worst@16/std:np.float64(0.0) - val-core/aime2025/acc/maj@16/mean:np.float64(0.04800000000000001) - val-core/aime2025/acc/maj@16/std:np.float64(0.06787077315598661) - val-aux/math500/reward/mean@4:np.float64(0.6475) - val-aux/math500/reward/std@4:np.float64(0.15132497224277933) - val-aux/math500/reward/best@2/mean:np.float64(0.7149759999999998) - val-aux/math500/reward/best@2/std:np.float64(0.12422905103405338) - val-aux/math500/reward/worst@2/mean:np.float64(0.57933) - val-aux/math500/reward/worst@2/std:np.float64(0.13628844217372224) - val-aux/math500/reward/maj@2/mean:np.float64(0.646764) - val-aux/math500/reward/maj@2/std:np.float64(0.15124901011608366) - val-aux/math500/reward/best@4/mean:np.float64(0.764632) - val-aux/math500/reward/best@4/std:np.float64(0.07629980693826069) - 
val-aux/math500/reward/worst@4/mean:np.float64(0.5211099999999999) - val-aux/math500/reward/worst@4/std:np.float64(0.09683498805430178) - val-aux/math500/reward/maj@4/mean:np.float64(0.663928) - val-aux/math500/reward/maj@4/std:np.float64(0.13881272017489982) - val-aux/math500/score/mean@4:np.float64(0.6475) - val-aux/math500/score/std@4:np.float64(0.15132497224277933) - val-aux/math500/score/best@2/mean:np.float64(0.7149759999999998) - val-aux/math500/score/best@2/std:np.float64(0.12422905103405338) - val-aux/math500/score/worst@2/mean:np.float64(0.57933) - val-aux/math500/score/worst@2/std:np.float64(0.13628844217372224) - val-aux/math500/score/maj@2/mean:np.float64(0.646764) - val-aux/math500/score/maj@2/std:np.float64(0.15124901011608366) - val-aux/math500/score/best@4/mean:np.float64(0.764632) - val-aux/math500/score/best@4/std:np.float64(0.07629980693826069) - val-aux/math500/score/worst@4/mean:np.float64(0.5211099999999999) - val-aux/math500/score/worst@4/std:np.float64(0.09683498805430178) - val-aux/math500/score/maj@4/mean:np.float64(0.663928) - val-aux/math500/score/maj@4/std:np.float64(0.13881272017489982) - val-core/math500/acc/mean@4:np.float64(0.6475) - val-aux/math500/acc/std@4:np.float64(0.15132497224277933) - val-aux/math500/acc/best@2/mean:np.float64(0.7149759999999998) - val-aux/math500/acc/best@2/std:np.float64(0.12422905103405338) - val-aux/math500/acc/worst@2/mean:np.float64(0.57933) - val-aux/math500/acc/worst@2/std:np.float64(0.13628844217372224) - val-aux/math500/acc/maj@2/mean:np.float64(0.646764) - val-aux/math500/acc/maj@2/std:np.float64(0.15124901011608366) - val-core/math500/acc/best@4/mean:np.float64(0.764632) - val-core/math500/acc/best@4/std:np.float64(0.07629980693826069) - val-aux/math500/acc/worst@4/mean:np.float64(0.5211099999999999) - val-aux/math500/acc/worst@4/std:np.float64(0.09683498805430178) - val-core/math500/acc/maj@4/mean:np.float64(0.663928) - val-core/math500/acc/maj@4/std:np.float64(0.13881272017489982) - 
val-aux/num_turns/min:np.int32(2) - val-aux/num_turns/max:np.int32(2) - val-aux/num_turns/mean:np.float64(2.0) - val-aux/response_length/clip_ratio:0.07027027027027027 - val-aux/aime2024/response_length/clip_ratio:0.1875 - val-aux/aime2025/response_length/clip_ratio:0.11666666666666667 - val-aux/math500/response_length/clip_ratio:0.031 - training/global_step:75 - training/epoch:0 - critic/score/mean:0.5118421316146851 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5038412809371948 - critic/rewards/max:1.0003553628921509 - critic/rewards/min:-0.06247643008828163 - critic/advantages/mean:-0.1292634755373001 - critic/advantages/max:2.474836826324463 - critic/advantages/min:-2.474846839904785 - critic/returns/mean:-0.1292634755373001 - critic/returns/max:2.474836826324463 - critic/returns/min:-2.474846839904785 - response_length/mean:1079.0250244140625 - response_length/max:8192.0 - response_length/min:131.0 - response_length/clip_ratio:0.005263158120214939 - response_length_non_aborted/mean:1079.0250244140625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:131.0 - response_length_non_aborted/clip_ratio:0.005263158120214939 - response/aborted_ratio:0.0 - prompt_length/mean:233.03158569335938 - prompt_length/max:458.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.265860378742218e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.102314286865294) - timing_s/agent_loop/generate_sequences/max:np.float64(28.29677320830524) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.481308950078528) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - 
timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.29677320830524) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:185 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.872220043092966 - timing_s/reward:0.0002188337966799736 - timing_s/old_log_prob:9.84437461849302 - timing_s/ref:18.768223751336336 - timing_s/adv:0.07295196503400803 - timing_s/update_actor:19.65120636858046 - timing_s/save_checkpoint:54.27431254833937 - timing_s/update_weights:25.759917741641402 - timing_s/step:158.70478736329824 - timing_s/testing:140.21080948598683 - timing_s/stop_profile:0.00033582746982574463 - timing_per_token_ms/adv:7.315951858824287e-05 - timing_per_token_ms/update_actor:0.019707115455126656 - timing_per_token_ms/gen:0.03642691567691223 - timing_per_token_ms/ref:0.018821620689231686 - perf/total_num_tokens:1484629 - perf/time_per_step:158.70478736329824 - perf/throughput:2338.6644862222543 - frontier/active_count:38.0 - frontier/completed_count:26.0 - frontier/blacklisted_count:1466.0 - frontier/mean_score:2.2994569787999812 - frontier/mean_frontier_pct:0.5801857235341736 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:6.0 - frontier/batch_hard_count:8.0 - frontier/force_completed_count:15.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:2.548151989999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - 
frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:1.4716499899999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:1.4947953325699996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.5427683425989995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:2.2931495248999996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:128.0 - frontier/cluster_10/score:4.256456455099999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.1149890364929096 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:1.3509563309 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:1.9635474446549623 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:1.7180583325012995 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:3.22613 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:2.4517625899999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:96.0 - frontier/cluster_20/score:2.7784348316299994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - 
frontier/cluster_21/score:2.140373813 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:96.0 - frontier/cluster_23/score:1.8556450069999992 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.54235199 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.0842919899999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:1.7680699999999998 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:1.3234489999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.4023500099999993 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:3.0477148929592994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:1.7831962999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:144.0 - 
frontier/cluster_38/score:3.047410171186489 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:2.950082588140999 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:144.0 - frontier/cluster_41/score:2.440826192003039 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:176.0 - frontier/cluster_42/score:1.4302669749284898 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:128.0 - frontier/cluster_43/score:2.1991456050715095 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.372949524899999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:2.8774972375612995 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:2.371752475099999 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:2.6310575869999995 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:1.5821542999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:80.0 - frontier/cluster_54/score:2.416408186999999 - frontier/cluster_54/cluster_size:122.0 
- frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:2.945637926569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9632782176299997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:75.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.029161942116779384 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01684207692200455 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0171069603131634 - cluster/prob_snapshot/cluster_8:0.0176559802095917 - cluster/prob_snapshot/cluster_9:0.02624360476639149 - cluster/prob_snapshot/cluster_10:0.048712375577807734 - cluster/prob_snapshot/cluster_11:0.024204673858496662 - cluster/prob_snapshot/cluster_12:0.015460816497057724 - cluster/prob_snapshot/cluster_13:0.033063378139363934 - cluster/prob_snapshot/cluster_14:0.022471523342914135 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.019662060129173622 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.036920959460195114 - 
cluster/prob_snapshot/cluster_19:0.02805882812887669 - cluster/prob_snapshot/cluster_20:0.03179737945507628 - cluster/prob_snapshot/cluster_21:0.024495186114457956 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.021236650127539947 - cluster/prob_snapshot/cluster_24:0.040539914453705056 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02385336612783719 - cluster/prob_snapshot/cluster_31:0.020234411134327248 - cluster/prob_snapshot/cluster_32:0.015146012986654524 - cluster/prob_snapshot/cluster_33:0.02749333329047785 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.03487911460765164 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.020407521799143218 - cluster/prob_snapshot/cluster_38:0.03487562726516372 - cluster/prob_snapshot/cluster_39:0.03376177638253304 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.027933668167224084 - cluster/prob_snapshot/cluster_42:0.01636847523149739 - cluster/prob_snapshot/cluster_43:0.025167791047450493 - cluster/prob_snapshot/cluster_44:0.027156863861630538 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.03293108425724449 - cluster/prob_snapshot/cluster_47:0.027143164405273344 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.030110742749692592 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.02132595042152401 - cluster/prob_snapshot/cluster_52:0.014192137894811418 - cluster/prob_snapshot/cluster_53:0.018106726874017282 - cluster/prob_snapshot/cluster_54:0.027654219982304044 - cluster/prob_snapshot/cluster_55:0.03371091012181906 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - 
cluster/prob_snapshot/cluster_59:0.033912791779127455 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826756) WARNING:2026-04-12 13:50:19,453:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  10%|▉         | 76/800 [2:19:42<30:03:26, 149.46s/it]
[36m(TaskRunner pid=2823680)[0m step:76 - global_seqlen/min:295011 - global_seqlen/max:407369 - global_seqlen/minmax_diff:112358 - global_seqlen/balanced_min:351622 - global_seqlen/balanced_max:351820 - global_seqlen/mean:351712.75 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.2705820448392508) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01065097562968731 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.013366269697144162) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005127124394657214) - actor/ppo_kl:np.float64(8.922520505528117e-05) - actor/pg_clipfrac_lower:np.float64(9.150273250938602e-07) - actor/grad_norm:np.float64(0.23469115793704987) - perf/mfu/actor:np.float64(0.19024462088012412) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.45737838745117) - actor/lr:np.float64(1e-06) - training/global_step:76 - training/epoch:0 - critic/score/mean:0.5657216310501099 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5576118230819702 - critic/rewards/max:1.00205397605896 - critic/rewards/min:-0.059088222682476044 - critic/advantages/mean:-0.17906010150909424 - critic/advantages/max:2.4748482704162598 - critic/advantages/min:-2.474853754043579 - critic/returns/mean:-0.17906010150909424 - critic/returns/max:2.4748482704162598 - critic/returns/min:-2.474853754043579 - response_length/mean:1075.8427734375 - response_length/max:8192.0 - response_length/min:172.0 - response_length/clip_ratio:0.009020618163049221 - response_length_non_aborted/mean:1075.8427734375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:172.0 - response_length_non_aborted/clip_ratio:0.009020618163049221 - response/aborted_ratio:0.0 - prompt_length/mean:234.67010498046875 - prompt_length/max:544.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.201040327548981e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2395577067509294) - timing_s/agent_loop/generate_sequences/max:np.float64(27.29367956146598) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.849043447562508) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.29367956146598) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:256 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.108163272961974 - timing_s/reward:0.00019478145986795425 - timing_s/old_log_prob:10.651075177825987 - timing_s/ref:18.641710915602744 - timing_s/adv:0.06822460796684027 - timing_s/update_actor:21.82709210179746 - timing_s/update_weights:26.817087563686073 - timing_s/step:107.50259730406106 - timing_s/stop_profile:5.879346281290054e-05 - timing_per_token_ms/adv:6.708694751094958e-05 - timing_per_token_ms/update_actor:0.021463120504285782 - timing_per_token_ms/gen:0.03486617213663943 - timing_per_token_ms/ref:0.01833085625522661 - perf/total_num_tokens:1406851 - perf/time_per_step:107.50259730406106 - perf/throughput:3271.667464974947 - frontier/active_count:38.0 - frontier/completed_count:26.0 - frontier/blacklisted_count:1497.0 - frontier/mean_score:2.3090763983478304 - frontier/mean_frontier_pct:0.6067514193078111 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:15.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:2.683706392999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.3301549929999998 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:1.4947953325699996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.5427683425989995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:2.2931495248999996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:144.0 - frontier/cluster_10/score:3.8795195185699987 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.1149890364929096 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:1.3509563309 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:1.9635474446549623 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:1.7180583325012995 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:3.22613 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:2.4517625899999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.8449043821409994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.140373813 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:96.0 - frontier/cluster_23/score:1.8556450069999992 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:3.979646393 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.0842919899999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.1376489999999997 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:96.0 - frontier/cluster_32/score:1.2264142999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:2.581645006999999 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:144.0 - frontier/cluster_35/score:3.0334004250715094 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:1.7831962999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:160.0 - frontier/cluster_38/score:2.433187119830542 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:2.950082588140999 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:144.0 - frontier/cluster_41/score:2.440826192003039 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:192.0 - frontier/cluster_42/score:1.3011868824499428 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:2.439401923550056 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.372949524899999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:2.8774972375612995 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:2.5602267325699994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:2.741740310899999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - 
frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:1.5821542999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:80.0 - frontier/cluster_54/score:2.416408186999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:2.945637926569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9742947523409997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:76.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.030585325153338602 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.015159341972488955 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.017035694144335693 - 
cluster/prob_snapshot/cluster_8:0.017582426869699564 - cluster/prob_snapshot/cluster_9:0.026134276099364067 - cluster/prob_snapshot/cluster_10:0.044213616744246835 - cluster/prob_snapshot/cluster_11:0.024103839207451627 - cluster/prob_snapshot/cluster_12:0.01539640802597209 - cluster/prob_snapshot/cluster_13:0.03292563886567888 - cluster/prob_snapshot/cluster_14:0.02237790885225915 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.019580149627774497 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.03676714982470152 - cluster/prob_snapshot/cluster_19:0.02794193739282925 - cluster/prob_snapshot/cluster_20:0.03242244598175772 - cluster/prob_snapshot/cluster_21:0.024393141213602264 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.021148180015631197 - cluster/prob_snapshot/cluster_24:0.045354730026615164 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.023753994995476087 - cluster/prob_snapshot/cluster_31:0.024362087409876033 - cluster/prob_snapshot/cluster_32:0.01397704318029851 - cluster/prob_snapshot/cluster_33:0.029422164874497177 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.034570673812561015 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.020322505766647157 - cluster/prob_snapshot/cluster_38:0.02773023882681216 - cluster/prob_snapshot/cluster_39:0.03362112764005894 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.027817298837130376 - cluster/prob_snapshot/cluster_42:0.014829201878713298 - cluster/prob_snapshot/cluster_43:0.027801066914795745 - cluster/prob_snapshot/cluster_44:0.02704373063343776 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.032793896108830406 - 
cluster/prob_snapshot/cluster_47:0.029178067796898202 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.031246718759406468 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.02123710829204399 - cluster/prob_snapshot/cluster_52:0.014133014632892151 - cluster/prob_snapshot/cluster_53:0.01803129576114284 - cluster/prob_snapshot/cluster_54:0.0275390148100245 - cluster/prob_snapshot/cluster_55:0.033570473283941527 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.033897065766768555 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826759) WARNING:2026-04-12 13:52:06,774:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  10%|▉         | 77/800 [2:21:25<27:13:38, 135.57s/it]
(RewardLoopWorker pid=2826758) WARNING:2026-04-12 13:52:10,331:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m step:77 - global_seqlen/min:396716 - global_seqlen/max:438923 - global_seqlen/minmax_diff:42207 - global_seqlen/balanced_min:413098 - global_seqlen/balanced_max:413258 - global_seqlen/mean:413201.25 - frontier/skipped_zero_acc_count:48.0 - actor/entropy:np.float64(0.2607709881849587) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008331159129738808 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.02632494250428863) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00025841641677288863) - actor/ppo_kl:np.float64(-1.1297283440114824e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.23403419852256774) - perf/mfu/actor:np.float64(0.2861645620223465) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.4095458984375) - actor/lr:np.float64(1e-06) - training/global_step:77 - training/epoch:0 - critic/score/mean:0.518750011920929 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5111659169197083 - critic/rewards/max:1.0026582479476929 - critic/rewards/min:-0.049938417971134186 - critic/advantages/mean:-0.13452596962451935 - critic/advantages/max:2.4748294353485107 - critic/advantages/min:-2.4748575687408447 - critic/returns/mean:-0.13452596962451935 - critic/returns/max:2.4748294353485107 - critic/returns/min:-2.4748575687408447 - response_length/mean:1159.296875 - response_length/max:8192.0 - response_length/min:188.0 - response_length/clip_ratio:0.010937499813735485 - response_length_non_aborted/mean:1159.296875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:188.0 - response_length_non_aborted/clip_ratio:0.010937499813735485 - response/aborted_ratio:0.0 - prompt_length/mean:244.6125030517578 - prompt_length/max:684.0 - prompt_length/min:184.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.80509614944458e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6423558881506324) - timing_s/agent_loop/generate_sequences/max:np.float64(30.252443701960146) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.723893886513906) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.252443701960146) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:207 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.05996316391975 - timing_s/reward:0.00013852771371603012 - timing_s/old_log_prob:8.365874007344246 - timing_s/ref:18.840569182299078 - timing_s/adv:0.05488505773246288 - timing_s/update_actor:16.953951865434647 - timing_s/update_weights:26.353692300617695 - timing_s/step:102.98697100579739 - timing_s/stop_profile:5.0866976380348206e-05 - timing_per_token_ms/adv:6.108507018622428e-05 - timing_per_token_ms/update_actor:0.018869130915050436 - timing_per_token_ms/gen:0.04321040927814509 - timing_per_token_ms/ref:0.020968867272748506 - perf/total_num_tokens:1652805 - perf/time_per_step:102.98697100579739 - perf/throughput:4012.170141179702 - frontier/active_count:35.0 - frontier/completed_count:29.0 - frontier/blacklisted_count:1544.0 - frontier/mean_score:2.341639494539695 - frontier/mean_frontier_pct:0.6105285126244866 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:8.0 - frontier/force_completed_count:16.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:2.683706392999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:1.2311084950999998 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.5427683425989995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:1.9052046674299996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:144.0 - frontier/cluster_10/score:3.8795195185699987 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:144.0 - frontier/cluster_11/score:2.3804923255450365 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:1.3509563309 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:1.9635474446549623 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:1.7180583325012995 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:3.1582909999999997 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:2.4517625899999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.8449043821409994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.3982616691 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:96.0 - frontier/cluster_23/score:2.1989515048999992 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:3.979646393 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.7590043929999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.1376489999999997 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:96.0 - frontier/cluster_32/score:1.2264142999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:2.7071515048999992 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:2.4233802975500565 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:1.7831962999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:2.950082588140999 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:144.0 - frontier/cluster_41/score:2.440826192003039 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:2.439401923550056 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.372949524899999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:128.0 - frontier/cluster_46/score:2.9142480662929096 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:2.5602267325699994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:2.819218217629999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:112.0 - frontier/cluster_53/score:1.40750801 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:80.0 - frontier/cluster_54/score:2.416408186999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:2.945637926569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9742947523409997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:77.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.03274514531083207 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.015021325235353083 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.018824031455573542 - cluster/prob_snapshot/cluster_9:0.02324628500516854 - 
cluster/prob_snapshot/cluster_10:0.047335815386934516 - cluster/prob_snapshot/cluster_11:0.029045489966641375 - cluster/prob_snapshot/cluster_12:0.016483644216556088 - cluster/prob_snapshot/cluster_13:0.035250723139392454 - cluster/prob_snapshot/cluster_14:0.02395815226570493 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.02096282585046587 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.03853577201986162 - cluster/prob_snapshot/cluster_19:0.029915091489373668 - cluster/prob_snapshot/cluster_20:0.03471199667430609 - cluster/prob_snapshot/cluster_21:0.02926230196153883 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.026830426289267836 - cluster/prob_snapshot/cluster_24:0.04855750977991345 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.021462427708714326 - cluster/prob_snapshot/cluster_31:0.026082446019852247 - cluster/prob_snapshot/cluster_32:0.01496404918568244 - cluster/prob_snapshot/cluster_33:0.0330312099854167 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.029568785987045973 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.021757604376373415 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.03599532470452054 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02978165142956622 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.029764273270175827 - cluster/prob_snapshot/cluster_44:0.028953456760692834 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.03555809109799767 - cluster/prob_snapshot/cluster_47:0.031238512754357576 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.03439858787832733 - 
cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.022736792678136632 - cluster/prob_snapshot/cluster_52:0.015131034753235342 - cluster/prob_snapshot/cluster_53:0.017173657458888086 - cluster/prob_snapshot/cluster_54:0.02948371603539988 - cluster/prob_snapshot/cluster_55:0.03594109332906923 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.036290748539663874 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826755) WARNING:2026-04-12 13:53:47,338:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  10%|▉         | 78/800 [2:22:58<24:36:01, 122.66s/it]
[36m(TaskRunner pid=2823680)[0m step:78 - global_seqlen/min:297923 - global_seqlen/max:421163 - global_seqlen/minmax_diff:123240 - global_seqlen/balanced_min:361498 - global_seqlen/balanced_max:361523 - global_seqlen/mean:361513.5 - frontier/skipped_zero_acc_count:49.0 - actor/entropy:np.float64(0.2777991359122097) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010824943892657757 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0027295309701003134) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003316855901630333) - actor/ppo_kl:np.float64(2.51559496604159e-05) - actor/pg_clipfrac_lower:np.float64(8.066723012234433e-07) - actor/grad_norm:np.float64(0.24052268266677856) - perf/mfu/actor:np.float64(0.2571269506617158) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.68894958496094) - actor/lr:np.float64(1e-06) - training/global_step:78 - training/epoch:0 - critic/score/mean:0.5854430198669434 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5779841542243958 - critic/rewards/max:1.0017608404159546 - critic/rewards/min:-0.03866611421108246 - critic/advantages/mean:-0.1019282415509224 - critic/advantages/max:2.474844455718994 - critic/advantages/min:-2.474848747253418 - critic/returns/mean:-0.1019282415509224 - critic/returns/max:2.474844455718994 - critic/returns/min:-2.474848747253418 - response_length/mean:969.4097900390625 - response_length/max:8192.0 - response_length/min:165.0 - response_length/clip_ratio:0.004746835213154554 - response_length_non_aborted/mean:969.4097900390625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:165.0 - response_length_non_aborted/clip_ratio:0.004746835213154554 - response/aborted_ratio:0.0 - prompt_length/mean:250.34176635742188 - prompt_length/max:962.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.838957339525223e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2921396475285292) - timing_s/agent_loop/generate_sequences/max:np.float64(26.990445824339986) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.179815532092107) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(26.990445824339986) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:187 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:28.414250845089555 - timing_s/reward:0.00019146129488945007 - timing_s/old_log_prob:7.4591806177049875 - timing_s/ref:16.382238035090268 - timing_s/adv:0.06620595324784517 - timing_s/update_actor:16.47356692235917 - timing_s/update_weights:23.17138018179685 - timing_s/step:92.32695196662098 - timing_s/stop_profile:6.0345977544784546e-05 - timing_per_token_ms/adv:8.588327054539428e-05 - timing_per_token_ms/update_actor:0.02136973694109115 - timing_per_token_ms/gen:0.046377968529543054 - timing_per_token_ms/ref:0.021251263855981087 - perf/total_num_tokens:1446054 - perf/time_per_step:92.32695196662098 - perf/throughput:3915.579278851296 - frontier/active_count:33.0 - frontier/completed_count:31.0 - frontier/blacklisted_count:1593.0 - frontier/mean_score:2.404926714608571 - frontier/mean_frontier_pct:0.6417172487365853 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:18.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:2.683706392999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.5427683425989995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:1.6336432672009997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:144.0 - frontier/cluster_10/score:3.615663662998999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:144.0 - frontier/cluster_11/score:2.3804923255450365 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:1.3509563309 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:1.9635474446549623 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:2.102640832750909 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:3.1582909999999997 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:2.4517625899999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.8449043821409994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.3982616691 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.439266053429999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:3.6857524750999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.7590043929999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.3963542999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:112.0 - frontier/cluster_33/score:2.795006053429999 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:2.4233802975500565 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.5482374099999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:2.950082588140999 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:160.0 - frontier/cluster_41/score:2.0085783344021273 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:2.607581346485039 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.372949524899999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:2.3399736464050362 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:2.6921587127989994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:2.873452752340999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:112.0 - frontier/cluster_53/score:1.40750801 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:80.0 - frontier/cluster_54/score:2.416408186999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:2.945637926569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:3.5820063266387 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:78.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.03381576480377348 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.019439492917746767 - cluster/prob_snapshot/cluster_9:0.020584553005138333 - cluster/prob_snapshot/cluster_10:0.04555879598321034 
- cluster/prob_snapshot/cluster_11:0.029995147311115997 - cluster/prob_snapshot/cluster_12:0.01702258550527036 - cluster/prob_snapshot/cluster_13:0.036403263797711964 - cluster/prob_snapshot/cluster_14:0.024741476468026827 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.026494108316980555 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.039795719053486854 - cluster/prob_snapshot/cluster_19:0.03089318090622089 - cluster/prob_snapshot/cluster_20:0.035846923391706725 - cluster/prob_snapshot/cluster_21:0.030219048086528463 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.030735719589805977 - cluster/prob_snapshot/cluster_24:0.04644194281013796 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0221641529034776 - cluster/prob_snapshot/cluster_31:0.030195014479481197 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.035218184662241483 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.03053555276605526 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.01950840533581552 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.037172210496829644 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02530888353631367 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.03285655902949779 - cluster/prob_snapshot/cluster_44:0.029900105031811664 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.029484595885843966 - cluster/prob_snapshot/cluster_47:0.033922267385097134 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.036206644177382945 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.023480183895891533 - 
cluster/prob_snapshot/cluster_52:0.01562575177468734 - cluster/prob_snapshot/cluster_53:0.017735159088093012 - cluster/prob_snapshot/cluster_54:0.030447701408260832 - cluster/prob_snapshot/cluster_55:0.03711620599845779 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.04513470019790013 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  10%|▉         | 79/800 [2:24:35<23:01:34, 114.97s/it]
[36m(TaskRunner pid=2823680)[0m step:79 - global_seqlen/min:315910 - global_seqlen/max:393062 - global_seqlen/minmax_diff:77152 - global_seqlen/balanced_min:356837 - global_seqlen/balanced_max:357081 - global_seqlen/mean:356944.5 - frontier/skipped_zero_acc_count:22.0 - actor/entropy:np.float64(0.2523366224793893) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010469404980540276 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04965644719777629) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000268029521547125) - actor/ppo_kl:np.float64(3.5791153941570086e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.20741002900259836) - perf/mfu/actor:np.float64(0.17441537067024804) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.53544998168945) - actor/lr:np.float64(1e-06) - training/global_step:79 - training/epoch:0 - critic/score/mean:0.5471698045730591 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5389462113380432 - critic/rewards/max:1.0046395063400269 - critic/rewards/min:-0.06751745939254761 - critic/advantages/mean:-0.12372317910194397 - critic/advantages/max:2.4748384952545166 - critic/advantages/min:-2.474848508834839 - critic/returns/mean:-0.12372317910194397 - critic/returns/max:2.4748384952545166 - critic/returns/min:-2.474848508834839 - response_length/mean:1070.7239990234375 - response_length/max:8192.0 - response_length/min:163.0 - response_length/clip_ratio:0.00589622650295496 - response_length_non_aborted/mean:1070.7239990234375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:163.0 - response_length_non_aborted/clip_ratio:0.00589622650295496 - response/aborted_ratio:0.0 - prompt_length/mean:238.83963012695312 - prompt_length/max:409.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.53314995765686e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.299968322739005) - timing_s/agent_loop/generate_sequences/max:np.float64(27.30137002840638) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.260710706452301) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.30137002840638) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:232 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:28.86213523708284 - timing_s/reward:0.00013082288205623627 - timing_s/old_log_prob:10.913096156902611 - timing_s/ref:11.545124999247491 - timing_s/adv:0.09639430698007345 - timing_s/update_actor:23.996126102283597 - timing_s/update_weights:20.968951512128115 - timing_s/step:96.80283258669078 - timing_s/stop_profile:5.750451236963272e-05 - timing_per_token_ms/adv:8.680183607538289e-05 - timing_per_token_ms/update_actor:0.021608203530165057 - timing_per_token_ms/gen:0.03178740276382676 - timing_per_token_ms/ref:0.010396236863465877 - perf/total_num_tokens:1427778 - perf/time_per_step:96.80283258669078 - perf/throughput:3687.3352820573923 - frontier/active_count:33.0 - frontier/completed_count:31.0 - frontier/blacklisted_count:1615.0 - frontier/mean_score:2.456470824684919 - frontier/mean_frontier_pct:0.6678403840948359 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:18.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.7785944750999994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.9799378398192995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:1.6336432672009997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:160.0 - frontier/cluster_10/score:4.0309645640993 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:144.0 - frontier/cluster_11/score:2.3804923255450365 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:1.3509563309 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:1.9635474446549623 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:2.102640832750909 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - 
frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.1108036999999995 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.2162338129999997 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.8449043821409994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.3982616691 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.439266053429999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:3.6857524750999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:1.5313030750999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.5774480099999995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:112.0 - frontier/cluster_33/score:2.795006053429999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - 
frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:2.5963662082850396 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.5482374099999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:112.0 - frontier/cluster_39/score:2.965057811698699 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:160.0 - frontier/cluster_41/score:2.0085783344021273 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:2.607581346485039 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:2.5610646674299993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:2.537981552483525 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:2.7845110989592996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:2.873452752340999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - 
frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:112.0 - frontier/cluster_53/score:1.40750801 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:1.9914857308999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:2.9619465485989993 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:3.4074044286470895 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:79.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.034276748468847706 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.02442451819709941 - cluster/prob_snapshot/cluster_9:0.02015262747388137 - cluster/prob_snapshot/cluster_10:0.049725989052611747 - 
cluster/prob_snapshot/cluster_11:0.029365759345574526 - cluster/prob_snapshot/cluster_12:0.016665400713067408 - cluster/prob_snapshot/cluster_13:0.03563941436890051 - cluster/prob_snapshot/cluster_14:0.024222326240918836 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.02593818262808487 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.038374882307006426 - cluster/prob_snapshot/cluster_19:0.039675468447491404 - cluster/prob_snapshot/cluster_20:0.035094747649730425 - cluster/prob_snapshot/cluster_21:0.029584962012587715 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.03009079220133764 - cluster/prob_snapshot/cluster_24:0.0454674518500541 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.018890158605417474 - cluster/prob_snapshot/cluster_31:0.031795405166895586 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.03447920174061345 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.03202878010876319 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.019099060603552213 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.036576960660490286 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.024777827410673532 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.03216713008194915 - cluster/prob_snapshot/cluster_44:0.031593300211536585 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.03130854684720503 - cluster/prob_snapshot/cluster_47:0.03434973595573268 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.035446920416685096 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.02298749936197367 - 
cluster/prob_snapshot/cluster_52:0.015297876734851085 - cluster/prob_snapshot/cluster_53:0.0173630223694021 - cluster/prob_snapshot/cluster_54:0.02456697301066286 - cluster/prob_snapshot/cluster_55:0.03653858011102742 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.04203374964537446 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  10%|█         | 80/800 [2:26:26<22:45:53, 113.82s/it]
[36m(TaskRunner pid=2823680)[0m step:80 - global_seqlen/min:345268 - global_seqlen/max:439666 - global_seqlen/minmax_diff:94398 - global_seqlen/balanced_min:388915 - global_seqlen/balanced_max:389043 - global_seqlen/mean:388966.25 - frontier/skipped_zero_acc_count:41.0 - actor/entropy:np.float64(0.28934049081395974) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009985308162868023 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0598035278419502) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003600878409418907) - actor/ppo_kl:np.float64(7.196724978413889e-05) - actor/pg_clipfrac_lower:np.float64(3.4172239793406334e-07) - actor/grad_norm:np.float64(0.22588160769505936) - perf/mfu/actor:np.float64(0.2243613935314682) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.4294204711914) - actor/lr:np.float64(1e-06) - training/global_step:80 - training/epoch:0 - critic/score/mean:0.5962643623352051 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5877975225448608 - critic/rewards/max:1.0016766786575317 - critic/rewards/min:-0.07221634685993195 - critic/advantages/mean:-0.10977684706449509 - critic/advantages/max:2.4748096466064453 - critic/advantages/min:-2.4748525619506836 - critic/returns/mean:-0.10977684706449509 - critic/returns/max:2.4748096466064453 - critic/returns/min:-2.4748525619506836 - response_length/mean:1170.2701416015625 - response_length/max:8192.0 - response_length/min:172.0 - response_length/clip_ratio:0.01149425283074379 - response_length_non_aborted/mean:1170.2701416015625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:172.0 - response_length_non_aborted/clip_ratio:0.01149425283074379 - response/aborted_ratio:0.0 - prompt_length/mean:238.7586212158203 - prompt_length/max:520.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.11857932806015e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2337173679843545) - timing_s/agent_loop/generate_sequences/max:np.float64(28.473154414445162) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.835196371200254) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.473154414445162) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:213 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.978458581492305 - timing_s/reward:0.00024214573204517365 - timing_s/old_log_prob:9.889409921132028 - timing_s/ref:20.487044448032975 - timing_s/adv:0.0715879974886775 - timing_s/update_actor:20.352261235937476 - timing_s/update_weights:28.41688345465809 - timing_s/step:110.95311335101724 - timing_s/stop_profile:6.501469761133194e-05 - timing_per_token_ms/adv:7.29980273856589e-05 - timing_per_token_ms/update_actor:0.020753128669313944 - timing_per_token_ms/gen:0.038033338630795895 - timing_per_token_ms/ref:0.02089056663311829 - perf/total_num_tokens:1555865 - perf/time_per_step:110.95311335101724 - perf/throughput:3505.681258077414 - frontier/active_count:33.0 - frontier/completed_count:31.0 - frontier/blacklisted_count:1656.0 - frontier/mean_score:2.4622933077121187 - frontier/mean_frontier_pct:0.6956175121181277 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:18.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.7785944750999994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:144.0 - frontier/cluster_8/score:1.6859564878735096 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:1.6336432672009997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:160.0 - frontier/cluster_10/score:4.0309645640993 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:144.0 - frontier/cluster_11/score:2.3804923255450365 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:1.3509563309 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:1.9635474446549623 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:2.102640832750909 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.0775625899999994 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.7513636690999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.8449043821409994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:1.97878316837 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.607486237400999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:3.4800267325699994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:112.0 - frontier/cluster_30/score:1.3719121525699998 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.7042136069999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:112.0 - frontier/cluster_33/score:2.795006053429999 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:2.5963662082850396 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.9837661869999996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:112.0 - frontier/cluster_39/score:2.9755404681890893 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:1.706004834081489 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:160.0 - frontier/cluster_43/score:2.725306942539527 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:2.6927452672009995 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:160.0 - frontier/cluster_46/score:2.6765870867384676 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:2.8491577692715095 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:2.873452752340999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.8634480099999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:1.2852556069999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:1.9914857308999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:2.9619465485989993 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:3.4074044286470895 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:80.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.034195695661059795 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.020748783413253175 - cluster/prob_snapshot/cluster_9:0.020104973390164926 - 
cluster/prob_snapshot/cluster_10:0.04960840406533069 - cluster/prob_snapshot/cluster_11:0.02929631935041436 - cluster/prob_snapshot/cluster_12:0.016625992730074726 - cluster/prob_snapshot/cluster_13:0.0355551393214834 - cluster/prob_snapshot/cluster_14:0.024165048709044255 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.025876847681660976 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.03787504605245263 - cluster/prob_snapshot/cluster_19:0.0461674027973742 - cluster/prob_snapshot/cluster_20:0.03501176055315082 - cluster/prob_snapshot/cluster_21:0.0243525522026286 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.03208989530987875 - cluster/prob_snapshot/cluster_24:0.04282810467872727 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.016883892508749325 - cluster/prob_snapshot/cluster_31:0.03328030280638225 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.034397670199956876 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.03195304297867349 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.02441387737622693 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.036619476929500315 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.020995515043787695 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.03353989494515966 - cluster/prob_snapshot/cluster_44:0.033139163874084436 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.03294030786019513 - cluster/prob_snapshot/cluster_47:0.03506410627439478 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.03536310055174609 - cluster/prob_snapshot/cluster_50:0.0 - 
cluster/prob_snapshot/cluster_51:0.022933141774088568 - cluster/prob_snapshot/cluster_52:0.015261702479183865 - cluster/prob_snapshot/cluster_53:0.015817424952614192 - cluster/prob_snapshot/cluster_54:0.024508880506842842 - cluster/prob_snapshot/cluster_55:0.03645217884361216 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.041934354178102866 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  10%|█         | 81/800 [2:28:17<22:35:44, 113.14s/it]
[36m(TaskRunner pid=2823680)[0m step:81 - global_seqlen/min:336849 - global_seqlen/max:485207 - global_seqlen/minmax_diff:148358 - global_seqlen/balanced_min:387619 - global_seqlen/balanced_max:387817 - global_seqlen/mean:387709.0 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.2581551462492865) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010644916445016861 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04508591154541364) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004250640876426073) - actor/ppo_kl:np.float64(8.642687487652083e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.22850284725427628) - perf/mfu/actor:np.float64(0.21586430982829358) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.34864807128906) - actor/lr:np.float64(1e-06) - training/global_step:81 - training/epoch:0 - critic/score/mean:0.5480769276618958 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5393847227096558 - critic/rewards/max:1.0063592195510864 - critic/rewards/min:-0.050557974725961685 - critic/advantages/mean:-0.16195431351661682 - critic/advantages/max:2.474848747253418 - critic/advantages/min:-2.474850654602051 - critic/returns/mean:-0.16195431351661682 - critic/returns/max:2.474848747253418 - critic/returns/min:-2.474850654602051 - response_length/mean:1169.3187255859375 - response_length/max:8192.0 - response_length/min:142.0 - response_length/clip_ratio:0.008241758681833744 - response_length_non_aborted/mean:1169.3187255859375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:142.0 - response_length_non_aborted/clip_ratio:0.008241758681833744 - response/aborted_ratio:0.0 - prompt_length/mean:242.64834594726562 - prompt_length/max:879.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.357875049114227e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1303031360730529) - timing_s/agent_loop/generate_sequences/max:np.float64(28.27517866063863) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.658030468614015) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.27517866063863) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:217 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.740406543016434 - timing_s/reward:0.00012273527681827545 - timing_s/old_log_prob:9.895710662007332 - timing_s/ref:21.463788479566574 - timing_s/adv:0.09230383671820164 - timing_s/update_actor:21.11603849940002 - timing_s/update_weights:27.605486864224076 - timing_s/step:111.2839524699375 - timing_s/stop_profile:5.65527006983757e-05 - timing_per_token_ms/adv:8.979741137198675e-05 - timing_per_token_ms/update_actor:0.020542651996863562 - timing_per_token_ms/gen:0.03611148426694472 - timing_per_token_ms/ref:0.02088095914783228 - perf/total_num_tokens:1550836 - perf/time_per_step:111.2839524699375 - perf/throughput:3483.961446325665 - frontier/active_count:32.0 - frontier/completed_count:32.0 - frontier/blacklisted_count:1693.0 - frontier/mean_score:2.4692844886939076 - frontier/mean_frontier_pct:0.7355632594251688 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:19.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.7785944750999994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:144.0 - frontier/cluster_8/score:1.6859564878735096 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:1.6336432672009997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:176.0 - frontier/cluster_10/score:4.32167519486951 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:160.0 - frontier/cluster_11/score:1.9663446278815255 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:1.3509563309 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:192.0 - frontier/cluster_14/score:1.6744832112584735 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:2.102640832750909 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - 
frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:3.0542938129999992 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.7513636690999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:128.0 - frontier/cluster_20/score:2.2914330674986996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:1.97878316837 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.607486237400999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:3.4800267325699994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:112.0 - frontier/cluster_30/score:1.3719121525699998 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.7042136069999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:128.0 - frontier/cluster_33/score:2.256504237400999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - 
frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:2.7174563457995276 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:2.2886363308999993 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:112.0 - frontier/cluster_39/score:2.9755404681890893 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:1.706004834081489 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:160.0 - frontier/cluster_43/score:2.725306942539527 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:144.0 - frontier/cluster_44/score:2.1849216870406996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:160.0 - frontier/cluster_46/score:2.773610960716927 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:2.8491577692715095 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:2.911416926638699 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 
- frontier/cluster_51/score:2.2044136069999993 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:2.2940400116299995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:2.9619465485989993 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:128.0 - frontier/cluster_59/score:3.2851831000529623 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:81.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.03516446879427936 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.021336601953837547 - cluster/prob_snapshot/cluster_9:0.02067455262193548 - cluster/prob_snapshot/cluster_10:0.054692908191840695 - 
cluster/prob_snapshot/cluster_11:0.024885050670609383 - cluster/prob_snapshot/cluster_12:0.01709701151646373 - cluster/prob_snapshot/cluster_13:0.03656242585438338 - cluster/prob_snapshot/cluster_14:0.021191402040315423 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.02660994564389823 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.03865357843264756 - cluster/prob_snapshot/cluster_19:0.04747533757091803 - cluster/prob_snapshot/cluster_20:0.028999203488784722 - cluster/prob_snapshot/cluster_21:0.025042466469414496 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.03299901056029432 - cluster/prob_snapshot/cluster_24:0.04404143625035877 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0173622176643118 - cluster/prob_snapshot/cluster_31:0.03422314261709414 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.02855716210167404 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.03439073593790431 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.028963809422564504 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.03765691642929829 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.021590323556944825 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.034490089070055865 - cluster/prob_snapshot/cluster_44:0.027651250000819853 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.03510139998824098 - cluster/prob_snapshot/cluster_47:0.03605748171075625 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.03684540172427959 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.027897929758262587 - 
cluster/prob_snapshot/cluster_52:0.015694070560698294 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.029032195638727807 - cluster/prob_snapshot/cluster_55:0.03748487874423795 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.04157559501414786 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(RewardLoopWorker pid=2826762)[0m WARNING:2026-04-12 14:00:55,561:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  10%|█         | 82/800 [2:29:59<21:51:37, 109.61s/it]
[36m(TaskRunner pid=2823680)[0m step:82 - global_seqlen/min:315196 - global_seqlen/max:371919 - global_seqlen/minmax_diff:56723 - global_seqlen/balanced_min:344904 - global_seqlen/balanced_max:345003 - global_seqlen/mean:344966.0 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.2882654651048336) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011473441496491432 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.07537420588778332) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00037916495575894045) - actor/ppo_kl:np.float64(1.8319148679250445e-05) - actor/pg_clipfrac_lower:np.float64(4.816885422692297e-06) - actor/grad_norm:np.float64(0.2507406721512477) - perf/mfu/actor:np.float64(0.21154938054934402) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.74072265625) - actor/lr:np.float64(1e-06) - training/global_step:82 - training/epoch:0 - critic/score/mean:0.6223404407501221 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6141642332077026 - critic/rewards/max:1.0077444314956665 - critic/rewards/min:-0.08309103548526764 - critic/advantages/mean:-0.12933959066867828 - critic/advantages/max:2.4748613834381104 - critic/advantages/min:-2.4748566150665283 - critic/returns/mean:-0.12933959066867828 - critic/returns/max:2.4748613834381104 - critic/returns/min:-2.4748566150665283 - response_length/mean:1011.5797729492188 - response_length/max:8192.0 - response_length/min:136.0 - response_length/clip_ratio:0.003989361692219973 - response_length_non_aborted/mean:1011.5797729492188 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:136.0 - response_length_non_aborted/clip_ratio:0.003989361692219973 - response/aborted_ratio:0.0 - prompt_length/mean:238.86170959472656 - prompt_length/max:962.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.462276309728622e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1053174622356892) - timing_s/agent_loop/generate_sequences/max:np.float64(27.00043831858784) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.841583230788274) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.00043831858784) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:214 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:28.657914183102548 - timing_s/reward:0.00013183895498514175 - timing_s/old_log_prob:8.949431674554944 - timing_s/ref:18.289658222347498 - timing_s/adv:0.0847430769354105 - timing_s/update_actor:19.04456263780594 - timing_s/update_weights:25.72972321230918 - timing_s/step:101.15091451164335 - timing_s/stop_profile:5.4533593356609344e-05 - timing_per_token_ms/adv:9.012037975460848e-05 - timing_per_token_ms/update_actor:0.02025301982470653 - timing_per_token_ms/gen:0.03767268673801583 - timing_per_token_ms/ref:0.01945021356536574 - perf/total_num_tokens:1379864 - perf/time_per_step:101.15091451164335 - perf/throughput:3410.4091066847586 - frontier/active_count:28.0 - frontier/completed_count:36.0 - frontier/blacklisted_count:1725.0 - frontier/mean_score:2.489908809765178 - frontier/mean_frontier_pct:0.7251959781568854 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:19.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.7785944750999994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:144.0 - frontier/cluster_8/score:1.6859564878735096 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:1.6336432672009997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:192.0 - frontier/cluster_10/score:4.525172636408657 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:160.0 - frontier/cluster_11/score:1.9663446278815255 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:1.3509563309 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:144.0 - frontier/cluster_16/score:2.371848582925636 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:3.038005669099999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.5259545683699995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:1.97878316837 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.607486237400999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:112.0 - frontier/cluster_30/score:1.3719121525699998 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.7042136069999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:1.8795529661806991 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - 
frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:2.7174563457995276 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:2.2886363308999993 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:112.0 - frontier/cluster_39/score:2.9755404681890893 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:2.0942033838570424 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:160.0 - frontier/cluster_43/score:2.725306942539527 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:144.0 - frontier/cluster_44/score:2.1849216870406996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:176.0 - frontier/cluster_46/score:2.841527672501849 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.4944104384900565 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:2.9379918486470893 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.2044136069999993 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.2401 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.505828008140999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:128.0 - frontier/cluster_59/score:3.1996281700370734 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:82.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0398550808682893 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.024182705596935833 - cluster/prob_snapshot/cluster_9:0.023432345060676395 - cluster/prob_snapshot/cluster_10:0.06490732022367036 - 
cluster/prob_snapshot/cluster_11:0.028204484267652624 - cluster/prob_snapshot/cluster_12:0.01937759334802145 - cluster/prob_snapshot/cluster_13:0.041439512358120476 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.034020875635850586 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.04357597436594062 - cluster/prob_snapshot/cluster_19:0.050574924019901456 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.028382897865424497 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.03740077070830142 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.019678175521780087 - cluster/prob_snapshot/cluster_31:0.038788190561069365 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.02695957835324957 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.038978139267331666 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.03282731138477545 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.04267999776479639 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.030038440645522302 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.03909074518040783 - cluster/prob_snapshot/cluster_44:0.03133966878155267 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0407577702294853 - cluster/prob_snapshot/cluster_47:0.050122467262158965 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.042141414937489366 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.03161925331726587 - cluster/prob_snapshot/cluster_52:0.017787513157344354 
- cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.03594262451806307 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.04589422479892273 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  10%|█         | 83/800 [2:31:52<22:02:42, 110.69s/it]
[36m(TaskRunner pid=2823680)[0m step:83 - global_seqlen/min:361868 - global_seqlen/max:401480 - global_seqlen/minmax_diff:39612 - global_seqlen/balanced_min:386131 - global_seqlen/balanced_max:386306 - global_seqlen/mean:386230.25 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.25191434567832216) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009359250776469707 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.11403350674481771) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0008725754139179896) - actor/ppo_kl:np.float64(8.612486688554215e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2296357384094825) - perf/mfu/actor:np.float64(0.20411695596156398) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.23854446411133) - actor/lr:np.float64(1e-06) - training/global_step:83 - training/epoch:0 - critic/score/mean:0.5567010045051575 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.548674464225769 - critic/rewards/max:1.0069185495376587 - critic/rewards/min:-0.04831147938966751 - critic/advantages/mean:-0.16909000277519226 - critic/advantages/max:2.474811315536499 - critic/advantages/min:-2.474851369857788 - critic/returns/mean:-0.16909000277519226 - critic/returns/max:2.474811315536499 - critic/returns/min:-2.474851369857788 - response_length/mean:1187.6829833984375 - response_length/max:8192.0 - response_length/min:109.0 - response_length/clip_ratio:0.015463917516171932 - response_length_non_aborted/mean:1187.6829833984375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:109.0 - response_length_non_aborted/clip_ratio:0.015463917516171932 - response/aborted_ratio:0.0 - prompt_length/mean:242.6907196044922 - prompt_length/max:657.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.2250294983387e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.936527444049716) - timing_s/agent_loop/generate_sequences/max:np.float64(28.610685526393354) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.844646351888514) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.610685526393354) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:194 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.375422975979745 - timing_s/reward:0.0002014385536313057 - timing_s/old_log_prob:10.291408026590943 - timing_s/ref:22.21597164310515 - timing_s/adv:0.0834099967032671 - timing_s/update_actor:22.142006983049214 - timing_s/update_weights:27.49041197542101 - timing_s/step:112.99296081159264 - timing_s/stop_profile:6.285961717367172e-05 - timing_per_token_ms/adv:7.514617215174023e-05 - timing_per_token_ms/update_actor:0.019948293181842044 - timing_per_token_ms/gen:0.03295794134379699 - timing_per_token_ms/ref:0.02001492981171126 - perf/total_num_tokens:1544921 - perf/time_per_step:112.99296081159264 - perf/throughput:3418.1797452321853 - frontier/active_count:25.0 - frontier/completed_count:39.0 - frontier/blacklisted_count:1755.0 - frontier/mean_score:2.567789988528026 - frontier/mean_frontier_pct:0.7411350928133273 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:20.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:112.0 - frontier/cluster_3/score:2.2450161325699995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:160.0 - frontier/cluster_8/score:1.4801695415114566 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:208.0 - frontier/cluster_10/score:4.667620845486059 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:160.0 - frontier/cluster_11/score:1.9663446278815255 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:1.3509563309 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:144.0 - frontier/cluster_16/score:2.5602940080479453 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 
- frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.626603968369999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:112.0 - frontier/cluster_19/score:3.3681681978589992 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:1.97878316837 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.7252403661806994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:112.0 - frontier/cluster_30/score:1.3719121525699998 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:2.792949524899999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:2.215687076326489 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:2.2886363308999993 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:128.0 - frontier/cluster_39/score:2.9828783277323625 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:2.0942033838570424 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:160.0 - frontier/cluster_43/score:2.725306942539527 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:160.0 - frontier/cluster_44/score:1.8294451809284897 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:176.0 - frontier/cluster_46/score:2.841527672501849 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.3460873069430392 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:2.9379918486470893 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 - 
frontier/cluster_51/score:2.2044136069999993 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.505828008140999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:128.0 - frontier/cluster_59/score:3.1996281700370734 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:83.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.03497195865082323 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.02305748598015147 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.07271032080254744 - cluster/prob_snapshot/cluster_11:0.030630925997319956 - 
cluster/prob_snapshot/cluster_12:0.021044654538503432 - cluster/prob_snapshot/cluster_13:0.04500456822259266 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.03988323063001928 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.05649377845653076 - cluster/prob_snapshot/cluster_19:0.05246796993378398 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.030824688579837143 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.042452698676388734 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.021371095902690113 - cluster/prob_snapshot/cluster_31:0.043507444726833676 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.03451508240510933 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.035651456561865476 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.04646607925194512 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.03262265828924016 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.04245373577613794 - cluster/prob_snapshot/cluster_44:0.028498361456377682 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.04426417557817089 - cluster/prob_snapshot/cluster_47:0.052124002693244686 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.04576685572843564 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.03433946883270885 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - 
cluster/prob_snapshot/cluster_54:0.039034781182824906 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.04984252114591732 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  10%|█         | 84/800 [2:33:34<21:31:29, 108.23s/it]
[36m(TaskRunner pid=2823680)[0m step:84 - global_seqlen/min:329853 - global_seqlen/max:384078 - global_seqlen/minmax_diff:54225 - global_seqlen/balanced_min:358171 - global_seqlen/balanced_max:358351 - global_seqlen/mean:358245.0 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.27315893537468383) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010975871235132217 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.011580578339049907) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0012572805595280241) - actor/ppo_kl:np.float64(0.00010791712539154711) - actor/pg_clipfrac_lower:np.float64(4.985689237299892e-06) - actor/grad_norm:np.float64(0.2436983324587345) - perf/mfu/actor:np.float64(0.2190767472425543) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.09449768066406) - actor/lr:np.float64(1e-06) - training/global_step:84 - training/epoch:0 - critic/score/mean:0.605555534362793 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5976282358169556 - critic/rewards/max:1.0027769804000854 - critic/rewards/min:-0.04737677797675133 - critic/advantages/mean:-0.12618835270404816 - critic/advantages/max:2.4748270511627197 - critic/advantages/min:-2.4748494625091553 - critic/returns/mean:-0.12618835270404816 - critic/returns/max:2.4748270511627197 - critic/returns/min:-2.4748494625091553 - response_length/mean:1004.584716796875 - response_length/max:8192.0 - response_length/min:109.0 - response_length/clip_ratio:0.0055555556900799274 - response_length_non_aborted/mean:1004.584716796875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:109.0 - response_length_non_aborted/clip_ratio:0.0055555556900799274 - response/aborted_ratio:0.0 - prompt_length/mean:235.35556030273438 - prompt_length/max:404.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.462928235530853e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3330205865204334) - timing_s/agent_loop/generate_sequences/max:np.float64(28.186269742436707) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.604910075423504) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.186269742436707) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:200 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.9840668477118 - timing_s/reward:0.00011540018022060394 - timing_s/old_log_prob:9.070829045027494 - timing_s/ref:18.510609734803438 - timing_s/adv:0.09409761615097523 - timing_s/update_actor:19.186108843423426 - timing_s/update_weights:25.015245711430907 - timing_s/step:102.23309941403568 - timing_s/stop_profile:5.568843334913254e-05 - timing_per_token_ms/adv:0.00010540115188228738 - timing_per_token_ms/update_actor:0.021490852318630294 - timing_per_token_ms/gen:0.04145448001276343 - timing_per_token_ms/ref:0.020734208451799803 - perf/total_num_tokens:1432980 - perf/time_per_step:102.23309941403568 - perf/throughput:3504.1977799101746 - frontier/active_count:22.0 - frontier/completed_count:42.0 - frontier/blacklisted_count:1791.0 - frontier/mean_score:2.4643270202500864 - frontier/mean_frontier_pct:0.7454618840799747 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:20.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:112.0 - frontier/cluster_3/score:2.2450161325699995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:160.0 - frontier/cluster_8/score:1.4801695415114566 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:208.0 - frontier/cluster_10/score:4.167334591840241 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:176.0 - frontier/cluster_11/score:1.6764412395170678 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:1.8456694316299997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - 
frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.438622777858999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:1.97878316837 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:144.0 - frontier/cluster_23/score:2.2076682563264893 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:128.0 - frontier/cluster_30/score:1.2603385067989998 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:2.855064667429999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:2.215687076326489 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:2.2886363308999993 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.3880148294126533 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:160.0 - frontier/cluster_43/score:2.8077148597776684 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:176.0 - frontier/cluster_44/score:1.5806116266499428 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:176.0 - frontier/cluster_46/score:2.841527672501849 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:144.0 - frontier/cluster_47/score:3.2422611148601272 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:128.0 - frontier/cluster_49/score:2.9565942940529624 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.2044136069999993 - 
frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.505828008140999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:144.0 - frontier/cluster_59/score:3.1397397190259513 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:84.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.041409353144102996 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.027301747354228342 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.07686654330880104 - cluster/prob_snapshot/cluster_11:0.03092198149731358 - cluster/prob_snapshot/cluster_12:0.03404339780585497 - 
cluster/prob_snapshot/cluster_13:0.05328869558706705 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0634254439743007 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.03649868249313592 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0407204710580846 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02324696092468902 - cluster/prob_snapshot/cluster_31:0.05266170667892673 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0408683783022071 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.04221392829644055 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.04404696605511817 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.051788338831838636 - cluster/prob_snapshot/cluster_44:0.029154402982706305 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.05241201662309271 - cluster/prob_snapshot/cluster_47:0.05980355042568937 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.054534422024899026 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.04066043900692667 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.04622003174146877 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - 
cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0579125418831076 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  11%|█         | 85/800 [2:35:25<21:37:35, 108.89s/it]
[36m(TaskRunner pid=2823680)[0m step:85 - global_seqlen/min:356815 - global_seqlen/max:417840 - global_seqlen/minmax_diff:61025 - global_seqlen/balanced_min:389483 - global_seqlen/balanced_max:389606 - global_seqlen/mean:389550.75 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.28679013019427657) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010617073625326157 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.03953139251098037) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0009341665621226033) - actor/ppo_kl:np.float64(0.00014001347034782916) - actor/pg_clipfrac_lower:np.float64(3.5342915983468024e-06) - actor/grad_norm:np.float64(0.2468883271018664) - perf/mfu/actor:np.float64(0.21945822909842877) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.88168334960938) - actor/lr:np.float64(1e-06) - training/global_step:85 - training/epoch:0 - critic/score/mean:0.5455729365348816 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5370733737945557 - critic/rewards/max:1.0005519390106201 - critic/rewards/min:-0.05643108859658241 - critic/advantages/mean:-0.15175406634807587 - critic/advantages/max:2.4748425483703613 - critic/advantages/min:-2.474846363067627 - critic/returns/mean:-0.15175406634807587 - critic/returns/max:2.4748425483703613 - critic/returns/min:-2.474846363067627 - response_length/mean:1158.0416259765625 - response_length/max:8192.0 - response_length/min:63.0 - response_length/clip_ratio:0.0078125 - response_length_non_aborted/mean:1158.0416259765625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:63.0 - response_length_non_aborted/clip_ratio:0.0078125 - response/aborted_ratio:0.0 - prompt_length/mean:243.4895782470703 - prompt_length/max:543.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.120162576436996e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.6329905744642019) - timing_s/agent_loop/generate_sequences/max:np.float64(28.669544112868607) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.806037334273242) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.669544112868607) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:242 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.436789290048182 - timing_s/reward:0.00014337804168462753 - timing_s/old_log_prob:9.931637830100954 - timing_s/ref:20.442673566751182 - timing_s/adv:0.08891045767813921 - timing_s/update_actor:20.876624742522836 - timing_s/update_weights:28.032489771023393 - timing_s/step:110.21740098576993 - timing_s/stop_profile:5.4681673645973206e-05 - timing_per_token_ms/adv:8.260167235068342e-05 - timing_per_token_ms/update_actor:0.019395290068268744 - timing_per_token_ms/gen:0.03422263394790075 - timing_per_token_ms/ref:0.01899213060004235 - perf/total_num_tokens:1558203 - perf/time_per_step:110.21740098576993 - perf/throughput:3534.3851925005433 - frontier/active_count:19.0 - frontier/completed_count:45.0 - frontier/blacklisted_count:1822.0 - frontier/mean_score:2.442613438802225 - frontier/mean_frontier_pct:0.7564383300307815 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:21.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 
- frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:1.8715112927989996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:176.0 - frontier/cluster_8/score:1.3361186790580197 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:176.0 - frontier/cluster_11/score:2.073508867661947 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:160.0 - frontier/cluster_12/score:2.1919686021409994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.438622777858999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:1.97878316837 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:144.0 - frontier/cluster_23/score:2.2076682563264893 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:1.8509809534285424 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - 
frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:2.2886363308999993 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.571610380588857 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:176.0 - frontier/cluster_43/score:2.8654004018443677 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:176.0 - frontier/cluster_44/score:1.5806116266499428 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:176.0 - frontier/cluster_46/score:2.8890693707512938 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:144.0 - frontier/cluster_47/score:3.169582780402089 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:128.0 - frontier/cluster_49/score:2.9696160058370733 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.4430895248999995 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - 
frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.6540796056986995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:144.0 - frontier/cluster_59/score:3.1397397190259513 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:85.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.040325903719805725 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.02878967036813624 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.044678394023711314 - cluster/prob_snapshot/cluster_12:0.047230874399147166 - cluster/prob_snapshot/cluster_13:0.06225120553053586 - cluster/prob_snapshot/cluster_14:0.0 - 
cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.07409283160738352 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0426373166098928 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0475691586219324 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.03988353156208831 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.04931379718873804 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.055411107061707085 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.06174147127408152 - cluster/prob_snapshot/cluster_44:0.03405781868372447 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0622514722369185 - cluster/prob_snapshot/cluster_47:0.06829576210747249 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.06398702994577189 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.052641837288963836 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.05718809127998167 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.06765272649000714 - 
cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  11%|█         | 86/800 [2:37:17<21:47:23, 109.86s/it]
[36m(TaskRunner pid=2823680)[0m step:86 - global_seqlen/min:337662 - global_seqlen/max:413498 - global_seqlen/minmax_diff:75836 - global_seqlen/balanced_min:379734 - global_seqlen/balanced_max:379923 - global_seqlen/mean:379830.75 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.2795722276593248) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010301928967237473 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05001362768962281) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005481732969580359) - actor/ppo_kl:np.float64(7.006793982651989e-05) - actor/pg_clipfrac_lower:np.float64(1.1452988625630194e-06) - actor/grad_norm:np.float64(0.23949750264485678) - perf/mfu/actor:np.float64(0.2087692511817063) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.29824447631836) - actor/lr:np.float64(1e-06) - training/global_step:86 - training/epoch:0 - critic/score/mean:0.5776315927505493 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5692306756973267 - critic/rewards/max:1.003442645072937 - critic/rewards/min:-0.0385354608297348 - critic/advantages/mean:-0.14367729425430298 - critic/advantages/max:2.4748358726501465 - critic/advantages/min:-2.474839448928833 - critic/returns/mean:-0.14367729425430298 - critic/returns/max:2.4748358726501465 - critic/returns/min:-2.474839448928833 - response_length/mean:1142.800048828125 - response_length/max:8192.0 - response_length/min:187.0 - response_length/clip_ratio:0.010526316240429878 - response_length_non_aborted/mean:1142.800048828125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:187.0 - response_length_non_aborted/clip_ratio:0.010526316240429878 - response/aborted_ratio:0.0 - prompt_length/mean:236.73684692382812 - prompt_length/max:543.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.120162576436996e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.616981402039528) - timing_s/agent_loop/generate_sequences/max:np.float64(29.08905779849738) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.8164627300666325) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.08905779849738) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:223 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.73550971969962 - timing_s/reward:0.0001423899084329605 - timing_s/old_log_prob:9.795904665254056 - timing_s/ref:21.4575125137344 - timing_s/adv:0.08420154638588428 - timing_s/update_actor:21.295363523997366 - timing_s/update_weights:27.150101366452873 - timing_s/step:111.90649394039065 - timing_s/stop_profile:5.1110051572322845e-05 - timing_per_token_ms/adv:8.031065573675021e-05 - timing_per_token_ms/update_actor:0.02031132066063111 - timing_per_token_ms/gen:0.03653942039830566 - timing_per_token_ms/ref:0.020465976866505923 - perf/total_num_tokens:1519323 - perf/time_per_step:111.90649394039065 - perf/throughput:3394.179699726138 - frontier/active_count:16.0 - frontier/completed_count:48.0 - frontier/blacklisted_count:1854.0 - frontier/mean_score:2.370940153051964 - frontier/mean_frontier_pct:0.7597913907365019 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:21.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:144.0 - frontier/cluster_3/score:1.6100579049592998 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:176.0 - frontier/cluster_8/score:1.3361186790580197 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:176.0 - frontier/cluster_11/score:2.073508867661947 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:160.0 - frontier/cluster_12/score:2.1919686021409994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - 
frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:3.307035944501299 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:2.2851482178589997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:144.0 - frontier/cluster_23/score:2.445367779428542 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:2.1956866673999795 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:2.502045431629999 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.7001272664122 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:192.0 - frontier/cluster_43/score:2.305780281291057 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:192.0 - frontier/cluster_44/score:1.40642813865496 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:144.0 - frontier/cluster_49/score:2.978731204085951 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:128.0 - frontier/cluster_51/score:2.6101626674299996 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - 
frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:144.0 - frontier/cluster_59/score:3.097817803318166 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:86.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.042442496463026815 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.03522122535806411 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.054659458216207185 - cluster/prob_snapshot/cluster_12:0.05778215762108689 - cluster/prob_snapshot/cluster_13:0.07615800079562045 - cluster/prob_snapshot/cluster_14:0.0 - 
cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.08717628163885634 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.060238451583158645 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0644619755659156 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.05788016898522324 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.06595604670812946 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.07117765243189747 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.060782330332204114 - cluster/prob_snapshot/cluster_44:0.037074642543290064 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.07852188930864659 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.06880610904681891 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.08166111340185393 - cluster/prob_snapshot/cluster_60:0.0 - 
cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  11%|█         | 87/800 [2:39:04<21:36:55, 109.14s/it]
[36m(TaskRunner pid=2823680)[0m step:87 - global_seqlen/min:350519 - global_seqlen/max:424798 - global_seqlen/minmax_diff:74279 - global_seqlen/balanced_min:389074 - global_seqlen/balanced_max:389248 - global_seqlen/mean:389169.5 - frontier/skipped_zero_acc_count:21.0 - actor/entropy:np.float64(0.2579420743579114) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008748020976781845 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04907015064964071) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00039605608017828436) - actor/ppo_kl:np.float64(-3.007687629027108e-06) - actor/pg_clipfrac_lower:np.float64(1.6459078122150256e-06) - actor/grad_norm:np.float64(0.21829060731189592) - perf/mfu/actor:np.float64(0.1884900189909448) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.92429733276367) - actor/lr:np.float64(1e-06) - training/global_step:87 - training/epoch:0 - critic/score/mean:0.5 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4916289746761322 - critic/rewards/max:1.0028328895568848 - critic/rewards/min:-0.03942566737532616 - critic/advantages/mean:-0.13659416139125824 - critic/advantages/max:2.4748406410217285 - critic/advantages/min:-2.4748423099517822 - critic/returns/mean:-0.13659416139125824 - critic/returns/max:2.4748406410217285 - critic/returns/min:-2.4748423099517822 - response_length/mean:1262.3773193359375 - response_length/max:8192.0 - response_length/min:234.0 - response_length/clip_ratio:0.005841121543198824 - response_length_non_aborted/mean:1262.3773193359375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:234.0 - response_length_non_aborted/clip_ratio:0.005841121543198824 - response/aborted_ratio:0.0 - prompt_length/mean:233.89720153808594 - prompt_length/max:549.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.281374514102936e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7277882769703865) - timing_s/agent_loop/generate_sequences/max:np.float64(28.294155304320157) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.041739289177713) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.294155304320157) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:205 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.347494433633983 - timing_s/reward:0.00011170096695423126 - timing_s/old_log_prob:10.02502866834402 - timing_s/ref:17.329727127216756 - timing_s/adv:0.11990532279014587 - timing_s/update_actor:24.121637783944607 - timing_s/update_weights:24.864095489494503 - timing_s/step:107.23022588621825 - timing_s/stop_profile:6.669946014881134e-05 - timing_per_token_ms/adv:9.361671846208837e-05 - timing_per_token_ms/update_actor:0.018833096986163148 - timing_per_token_ms/gen:0.02808405964642996 - timing_per_token_ms/ref:0.013530276619436244 - perf/total_num_tokens:1556678 - perf/time_per_step:107.23022588621825 - perf/throughput:3629.2891932629786 - frontier/active_count:14.0 - frontier/completed_count:50.0 - frontier/blacklisted_count:1874.0 - frontier/mean_score:2.4523435669670457 - frontier/mean_frontier_pct:0.7696128025267609 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:21.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:144.0 - frontier/cluster_3/score:2.0270405334715096 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:176.0 - frontier/cluster_8/score:1.8352830753406137 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:160.0 - frontier/cluster_12/score:2.4343780214986994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:3.214925161150909 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.4996037525012995 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:160.0 - frontier/cluster_23/score:2.6117574455999795 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - 
frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:128.0 - frontier/cluster_37/score:2.0514318021409994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.79008908648854 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:192.0 - frontier/cluster_43/score:2.51404619690374 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:192.0 - frontier/cluster_44/score:1.8844996970584718 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - frontier/cluster_49/score:2.3851118428601654 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:144.0 - frontier/cluster_51/score:2.1271138672009995 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - 
frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:3.068472462322716 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:87.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.05904091558946924 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.05345566176143249 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.07090529513685423 - cluster/prob_snapshot/cluster_13:0.08414857386435988 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - 
cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.09364002442560898 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0728051026714331 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0760717648905384 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.05975135172079273 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0812659695365606 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0732257628046618 - cluster/prob_snapshot/cluster_44:0.05488917745115895 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.06947033601966682 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.06195571731736604 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.08937434681009679 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - 
cluster/prob_snapshot/cluster_63:0.0
[36m(RewardLoopWorker pid=2826762)[0m WARNING:2026-04-12 14:11:25,616:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  11%|█         | 88/800 [2:40:55<21:41:47, 109.70s/it]
[36m(RewardLoopWorker pid=2826757)[0m WARNING:2026-04-12 14:11:29,328:WARNING: Error in configuration: macro '\frac' failed its substitution![32m [repeated 2x across cluster][0m
[36m(TaskRunner pid=2823680)[0m step:88 - global_seqlen/min:325357 - global_seqlen/max:400380 - global_seqlen/minmax_diff:75023 - global_seqlen/balanced_min:355839 - global_seqlen/balanced_max:355940 - global_seqlen/mean:355903.5 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.2926642530958872) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010058391839265823 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.011855931585159851) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00036809194668366014) - actor/ppo_kl:np.float64(3.0845278815275155e-05) - actor/pg_clipfrac_lower:np.float64(1.410962609878304e-06) - actor/grad_norm:np.float64(0.22688368421334487) - perf/mfu/actor:np.float64(0.19040046298115826) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.67439651489258) - actor/lr:np.float64(1e-06) - training/global_step:88 - training/epoch:0 - critic/score/mean:0.5283505320549011 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.520031750202179 - critic/rewards/max:1.000998854637146 - critic/rewards/min:-0.05970515310764313 - critic/advantages/mean:-0.1252654641866684 - critic/advantages/max:2.474844455718994 - critic/advantages/min:-2.474846124649048 - critic/returns/mean:-0.1252654641866684 - critic/returns/max:2.474844455718994 - critic/returns/min:-2.474846124649048 - response_length/mean:1135.4097900390625 - response_length/max:8192.0 - response_length/min:152.0 - response_length/clip_ratio:0.012886597774922848 - response_length_non_aborted/mean:1135.4097900390625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:152.0 - response_length_non_aborted/clip_ratio:0.012886597774922848 - response/aborted_ratio:0.0 - prompt_length/mean:228.5360870361328 - prompt_length/max:500.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.707772940397263e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0903692366555333) - timing_s/agent_loop/generate_sequences/max:np.float64(28.774654803797603) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.329984671066086) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.774654803797603) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:210 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.126671590842307 - timing_s/reward:0.00015431735664606094 - timing_s/old_log_prob:11.378754099830985 - timing_s/ref:19.427031298168004 - timing_s/adv:0.08033395931124687 - timing_s/update_actor:21.833259532228112 - timing_s/update_weights:26.555502169765532 - timing_s/step:110.78313975967467 - timing_s/stop_profile:5.242787301540375e-05 - timing_per_token_ms/adv:7.589974444148636e-05 - timing_per_token_ms/update_actor:0.020628123312089235 - timing_per_token_ms/gen:0.035327940989154545 - timing_per_token_ms/ref:0.018354712296388402 - perf/total_num_tokens:1423614 - perf/time_per_step:110.78313975967467 - perf/throughput:3212.614309109424 - frontier/active_count:12.0 - frontier/completed_count:52.0 - frontier/blacklisted_count:1905.0 - frontier/mean_score:2.4218004910671715 - frontier/mean_frontier_pct:0.7901314026062661 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:21.0 - frontier/replay_slots_count:16.0 - 
frontier/replay_pool_size:3013.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:192.0 - frontier/cluster_8/score:2.1846981527384295 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:2.604064615049089 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:128.0 - frontier/cluster_18/score:2.550447612805636 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:144.0 - frontier/cluster_21/score:2.0497226267509094 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:176.0 - frontier/cluster_23/score:2.1282302119199854 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - 
frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:1.7360022614986996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:176.0 - frontier/cluster_39/score:2.853062360541978 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:208.0 - frontier/cluster_43/score:2.6598323378326176 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - frontier/cluster_49/score:2.5695782900021156 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:160.0 - frontier/cluster_51/score:1.7889797070406996 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - 
frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:3.047930723625901 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:88.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.07517472230532284 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.08960498000881989 - cluster/prob_snapshot/cluster_13:0.0994114710541567 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - 
cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.08776003715049266 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.07053026024478228 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.07323167961777396 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.05973524890200343 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.09817290796198665 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.09152392843132716 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.08841831726299033 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.06155818483119495 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.10487826222914917 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  11%|█         | 89/800 [2:42:45<21:39:41, 109.68s/it]
(TaskRunner pid=2823680) step:89 - global_seqlen/min:290188 - global_seqlen/max:453467 - global_seqlen/minmax_diff:163279 - global_seqlen/balanced_min:372892 - global_seqlen/balanced_max:372973 - global_seqlen/mean:372936.0 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.2648724878284459) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01055210456252098 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04241706312313909) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002694329612419703) - actor/ppo_kl:np.float64(2.6169550305728723e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2421081898113092) - perf/mfu/actor:np.float64(0.20671369001621911) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.30546951293945) - actor/lr:np.float64(1e-06) - training/global_step:89 - training/epoch:0 - critic/score/mean:0.5065104365348816 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4976315498352051 - critic/rewards/max:1.0052913427352905 - critic/rewards/min:-0.07699555158615112 - critic/advantages/mean:-0.12875968217849731 - critic/advantages/max:2.4748497009277344 - critic/advantages/min:-2.474853754043579 - critic/returns/mean:-0.12875968217849731 - critic/returns/max:2.4748497009277344 - critic/returns/min:-2.474853754043579 - response_length/mean:1137.4361572265625 - response_length/max:8192.0 - response_length/min:172.0 - response_length/clip_ratio:0.0078125 - response_length_non_aborted/mean:1137.4361572265625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:172.0 - response_length_non_aborted/clip_ratio:0.0078125 - response/aborted_ratio:0.0 - prompt_length/mean:229.9791717529297 - prompt_length/max:503.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - 
num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.173620492219925e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2506398865953088) - timing_s/agent_loop/generate_sequences/max:np.float64(28.312390002422035) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.768322466325117) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.312390002422035) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:237 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.641419479623437 - timing_s/reward:0.00011845491826534271 - timing_s/old_log_prob:10.259662173688412 - timing_s/ref:20.583878107368946 - timing_s/adv:0.08633071277290583 - timing_s/update_actor:21.048689803108573 - timing_s/update_weights:27.406545411795378 - timing_s/step:109.40597965940833 - timing_s/stop_profile:5.84600493311882e-05 - timing_per_token_ms/adv:8.220602544614548e-05 - timing_per_token_ms/update_actor:0.020043030735933127 - timing_per_token_ms/gen:0.03393209953354004 - timing_per_token_ms/ref:0.019600426697806504 - perf/total_num_tokens:1491744 - perf/time_per_step:109.40597965940833 - perf/throughput:3408.735072442903 - frontier/active_count:10.0 - frontier/completed_count:54.0 - frontier/blacklisted_count:1936.0 - frontier/mean_score:2.4114407079843274 - frontier/mean_frontier_pct:0.7796856279594162 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:21.0 - frontier/replay_slots_count:32.0 - frontier/replay_pool_size:3267.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:192.0 - frontier/cluster_8/score:2.4292887069169007 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:2.7228452305343622 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:128.0 - frontier/cluster_18/score:2.685313328963945 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:144.0 - frontier/cluster_21/score:2.3348058387256367 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:176.0 - frontier/cluster_23/score:2.3897611483439896 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:160.0 - frontier/cluster_37/score:1.5152015830490897 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:176.0 - frontier/cluster_39/score:2.897143652379384 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:209.0 - frontier/cluster_43/score:0.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:176.0 - frontier/cluster_49/score:2.6987048030014806 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:176.0 - frontier/cluster_51/score:1.5522857949284896 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:89.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.10074013841076325 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.11291362966209238 - cluster/prob_snapshot/cluster_13:0.11980626284669886 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - 
cluster/prob_snapshot/cluster_18:0.11135721977624495 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.09682202971008363 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.09910097065341245 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.06283387263191782 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.1201416084080726 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.11191255062029998 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.06437171728041419 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  11%|█▏        | 90/800 [2:44:24<20:58:03, 106.31s/it]
(TaskRunner pid=2823680) step:90 - global_seqlen/min:356839 - global_seqlen/max:389747 - global_seqlen/minmax_diff:32908 - global_seqlen/balanced_min:372950 - global_seqlen/balanced_max:373058 - global_seqlen/mean:373016.5 - frontier/skipped_zero_acc_count:24.0 - actor/entropy:np.float64(0.2800710675521539) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0105481231585145 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.055936749136890285) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003974624535378378) - actor/ppo_kl:np.float64(3.28911382762971e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.22264678203142607) - perf/mfu/actor:np.float64(0.18124934489493028) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.41128921508789) - actor/lr:np.float64(1e-06) - training/global_step:90 - training/epoch:0 - critic/score/mean:0.5228365659713745 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5139160752296448 - critic/rewards/max:1.0171235799789429 - critic/rewards/min:-0.09592707455158234 - critic/advantages/mean:-0.13062986731529236 - critic/advantages/max:2.4748141765594482 - critic/advantages/min:-2.474856376647949 - critic/returns/mean:-0.13062986731529236 - critic/returns/max:2.4748141765594482 - critic/returns/min:-2.474856376647949 - response_length/mean:1180.6346435546875 - response_length/max:8192.0 - response_length/min:158.0 - response_length/clip_ratio:0.006009615492075682 - response_length_non_aborted/mean:1180.6346435546875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:158.0 - response_length_non_aborted/clip_ratio:0.006009615492075682 - response/aborted_ratio:0.0 - prompt_length/mean:237.25 - prompt_length/max:817.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.229499846696854e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4140738854184747) - timing_s/agent_loop/generate_sequences/max:np.float64(27.454247524030507) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.794771860676519) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.454247524030507) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:208 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.46530770417303 - timing_s/reward:0.00011921394616365433 - timing_s/old_log_prob:10.657963559962809 - timing_s/ref:12.469615965150297 - timing_s/adv:0.0759829580783844 - timing_s/update_actor:23.950659655034542 - timing_s/update_weights:21.195460132323205 - timing_s/step:98.21583417244256 - timing_s/stop_profile:5.453824996948242e-05 - timing_per_token_ms/adv:6.440980442016852e-05 - timing_per_token_ms/update_actor:0.020302675009353843 - timing_per_token_ms/gen:0.02999660761830851 - timing_per_token_ms/ref:0.01057033768916172 - perf/total_num_tokens:1492066 - perf/time_per_step:98.21583417244256 - perf/throughput:3797.926303258555 - frontier/active_count:10.0 - frontier/completed_count:54.0 - frontier/blacklisted_count:1960.0 - frontier/mean_score:2.4646802053790298 - frontier/mean_frontier_pct:0.830743590357198 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:21.0 - frontier/replay_slots_count:48.0 - frontier/replay_pool_size:3578.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:208.0 - frontier/cluster_8/score:2.60050209484183 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:192.0 - frontier/cluster_12/score:2.8059916613740534 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:144.0 - frontier/cluster_18/score:2.7797193302747614 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:160.0 - frontier/cluster_21/score:2.534364087107946 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:1.9728328038407927 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:176.0 - frontier/cluster_37/score:1.3606411081343628 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:192.0 - frontier/cluster_39/score:2.9280005566655687 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:209.0 - frontier/cluster_43/score:0.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:176.0 - frontier/cluster_49/score:2.7890933621010365 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:176.0 - frontier/cluster_51/score:1.9866000564499426 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:90.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.1055107307295436 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.11384810310279322 - cluster/prob_snapshot/cluster_13:0.11721833066597405 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - 
cluster/prob_snapshot/cluster_18:0.1127821501632616 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.10282729911884045 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0800441696060687 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.05520558428492419 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.11879839624935389 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.11316248477242577 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.08060275130681444 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  11%|█▏        | 91/800 [2:46:11<20:58:41, 106.52s/it]
(TaskRunner pid=2823680) step:91 - global_seqlen/min:311228 - global_seqlen/max:391915 - global_seqlen/minmax_diff:80687 - global_seqlen/balanced_min:362933 - global_seqlen/balanced_max:363001 - global_seqlen/mean:362979.5 - frontier/skipped_zero_acc_count:27.0 - actor/entropy:np.float64(0.283554085475557) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01058931089937687 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.025342798558995128) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003544819428854823) - actor/ppo_kl:np.float64(1.9199638066034894e-05) - actor/pg_clipfrac_lower:np.float64(1.6807458522117825e-06) - actor/grad_norm:np.float64(0.24154596145336443) - perf/mfu/actor:np.float64(0.18464433609420683) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.83532333374023) - actor/lr:np.float64(1e-06) - training/global_step:91 - training/epoch:0 - critic/score/mean:0.5123762488365173 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5037733912467957 - critic/rewards/max:1.0025763511657715 - critic/rewards/min:-0.056361597031354904 - critic/advantages/mean:-0.13536767661571503 - critic/advantages/max:2.4748356342315674 - critic/advantages/min:-2.474846601486206 - critic/returns/mean:-0.13536767661571503 - critic/returns/max:2.4748356342315674 - critic/returns/min:-2.474846601486206 - response_length/mean:1094.3118896484375 - response_length/max:8192.0 - response_length/min:100.0 - response_length/clip_ratio:0.006188118830323219 - response_length_non_aborted/mean:1094.3118896484375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:100.0 - response_length_non_aborted/clip_ratio:0.006188118830323219 - response/aborted_ratio:0.0 - prompt_length/mean:233.09901428222656 - prompt_length/max:515.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.775200694799423e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.8772010132670403) - timing_s/agent_loop/generate_sequences/max:np.float64(27.989268551580608) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.288769948287154) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.989268551580608) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:181 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.51212929096073 - timing_s/reward:0.0002213362604379654 - timing_s/old_log_prob:10.00353963766247 - timing_s/ref:19.055406224913895 - timing_s/adv:0.07498861849308014 - timing_s/update_actor:23.09201947785914 - timing_s/update_weights:24.649035609327257 - timing_s/step:106.77394589688629 - timing_s/stop_profile:6.069056689739227e-05 - timing_per_token_ms/adv:6.991632868000326e-05 - timing_per_token_ms/update_actor:0.021530056909209786 - timing_per_token_ms/gen:0.03337705924307143 - timing_per_token_ms/ref:0.017766483388075772 - perf/total_num_tokens:1451918 - perf/time_per_step:106.77394589688629 - perf/throughput:3399.513776052975 - frontier/active_count:9.0 - frontier/completed_count:55.0 - frontier/blacklisted_count:1986.0 - frontier/mean_score:2.640781084428794 - frontier/mean_frontier_pct:0.8609063339894897 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:5.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:22.0 - frontier/replay_slots_count:48.0 - 
frontier/replay_pool_size:3568.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:224.0 - frontier/cluster_8/score:2.120351466389281 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:192.0 - frontier/cluster_12/score:2.8641941629618373 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:160.0 - frontier/cluster_18/score:3.4458035311923325 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:160.0 - frontier/cluster_21/score:2.674054860975562 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:2.2809829626885545 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - 
frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:208.0 - frontier/cluster_39/score:2.349600389665898 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:209.0 - frontier/cluster_43/score:0.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:192.0 - frontier/cluster_49/score:2.8523653534707254 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:192.0 - frontier/cluster_51/score:2.29062003951496 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:91.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.08921398625798864 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.12051123728549629 - cluster/prob_snapshot/cluster_13:0.12155734318469255 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - 
cluster/prob_snapshot/cluster_18:0.14498250584984979 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.11251110837130578 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0959725714881283 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.09885965614576749 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.12001353901984721 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.09637805239692412 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  12%|█▏        | 92/800 [2:47:57<20:55:53, 106.43s/it]
[36m(TaskRunner pid=2823680)[0m step:92 - global_seqlen/min:341452 - global_seqlen/max:393485 - global_seqlen/minmax_diff:52033 - global_seqlen/balanced_min:364459 - global_seqlen/balanced_max:364507 - global_seqlen/mean:364486.75 - frontier/skipped_zero_acc_count:29.0 - actor/entropy:np.float64(0.29085919994860887) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01057103555649519 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05512237460061442) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006161345671353047) - actor/ppo_kl:np.float64(0.00010658517275714985) - actor/pg_clipfrac_lower:np.float64(5.415888190327678e-06) - actor/grad_norm:np.float64(0.24787008647735304) - perf/mfu/actor:np.float64(0.202752942138242) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.24027252197266) - actor/lr:np.float64(1e-06) - training/global_step:92 - training/epoch:0 - critic/score/mean:0.5694444179534912 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5614131689071655 - critic/rewards/max:1.0091966390609741 - critic/rewards/min:-0.0391756072640419 - critic/advantages/mean:-0.18602225184440613 - critic/advantages/max:2.4748306274414062 - critic/advantages/min:-2.474855661392212 - critic/returns/mean:-0.18602225184440613 - critic/returns/max:2.4748306274414062 - critic/returns/min:-2.474855661392212 - response_length/mean:1103.7449951171875 - response_length/max:8192.0 - response_length/min:210.0 - response_length/clip_ratio:0.008838383480906487 - response_length_non_aborted/mean:1103.7449951171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:210.0 - response_length_non_aborted/clip_ratio:0.008838383480906487 - response/aborted_ratio:0.0 - prompt_length/mean:226.5757598876953 - prompt_length/max:572.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.827261626720428e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6663475101813674) - timing_s/agent_loop/generate_sequences/max:np.float64(28.61599847022444) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.622921933823818) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.61599847022444) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:207 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.132956322282553 - timing_s/reward:0.00012580770999193192 - timing_s/old_log_prob:10.457074660807848 - timing_s/ref:17.77492267359048 - timing_s/adv:0.12460378650575876 - timing_s/update_actor:21.10211363993585 - timing_s/update_weights:25.99036247562617 - timing_s/step:105.9570402726531 - timing_s/stop_profile:5.638599395751953e-05 - timing_per_token_ms/adv:0.00011826322211527064 - timing_per_token_ms/update_actor:0.020028315531053924 - timing_per_token_ms/gen:0.034470519697955025 - timing_per_token_ms/ref:0.016870431366316774 - perf/total_num_tokens:1457947 - perf/time_per_step:105.9570402726531 - perf/throughput:3439.948389102672 - frontier/active_count:9.0 - frontier/completed_count:55.0 - frontier/blacklisted_count:2014.0 - frontier/mean_score:2.678181992200156 - frontier/mean_frontier_pct:0.8929689278887223 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:22.0 - frontier/replay_slots_count:56.0 - 
frontier/replay_pool_size:3561.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:240.0 - frontier/cluster_8/score:1.7842460264724966 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:208.0 - frontier/cluster_12/score:2.904935914073286 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:160.0 - frontier/cluster_18/score:3.3120624718346323 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:176.0 - frontier/cluster_21/score:2.771838402682893 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:208.0 - frontier/cluster_23/score:2.496688073881988 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - 
frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:208.0 - frontier/cluster_39/score:2.5447202727661287 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:209.0 - frontier/cluster_43/score:0.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:192.0 - frontier/cluster_49/score:2.8966557474295076 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:192.0 - frontier/cluster_51/score:2.5034340276604716 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:92.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.07402393081363372 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.12051856746826019 - cluster/prob_snapshot/cluster_13:0.11985979051850962 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - 
cluster/prob_snapshot/cluster_18:0.13740923596183147 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.11499668268978727 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.1035813797549256 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.10557411624657172 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.12017504394422232 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.10386125260225804 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  12%|█▏        | 93/800 [2:49:45<20:59:34, 106.90s/it]
[36m(TaskRunner pid=2823680)[0m step:93 - global_seqlen/min:319243 - global_seqlen/max:417135 - global_seqlen/minmax_diff:97892 - global_seqlen/balanced_min:369111 - global_seqlen/balanced_max:369352 - global_seqlen/mean:369241.5 - frontier/skipped_zero_acc_count:21.0 - actor/entropy:np.float64(0.28024871316221023) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01123441755771637 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.00866458572272677) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000518827498160169) - actor/ppo_kl:np.float64(0.00012883694646735353) - actor/pg_clipfrac_lower:np.float64(8.914639693102799e-07) - actor/grad_norm:np.float64(0.25110516803605215) - perf/mfu/actor:np.float64(0.15099348324590509) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.4041633605957) - actor/lr:np.float64(1e-06) - training/global_step:93 - training/epoch:0 - critic/score/mean:0.5829439163208008 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5736798644065857 - critic/rewards/max:1.003051996231079 - critic/rewards/min:-0.05078381672501564 - critic/advantages/mean:-0.17076148092746735 - critic/advantages/max:2.4748425483703613 - critic/advantages/min:-2.4748423099517822 - critic/returns/mean:-0.17076148092746735 - critic/returns/max:2.4748425483703613 - critic/returns/min:-2.4748423099517822 - response_length/mean:1109.98486328125 - response_length/max:8192.0 - response_length/min:155.0 - response_length/clip_ratio:0.011682243086397648 - response_length_non_aborted/mean:1109.98486328125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:155.0 - response_length_non_aborted/clip_ratio:0.011682243086397648 - response/aborted_ratio:0.0 - prompt_length/mean:233.6728973388672 - prompt_length/max:447.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.826795965433121e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6298369765281677) - timing_s/agent_loop/generate_sequences/max:np.float64(28.527104194276035) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.822367272268821) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.527104194276035) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:192 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.520809738896787 - timing_s/reward:0.00018781889230012894 - timing_s/old_log_prob:11.244327023625374 - timing_s/ref:11.655688308179379 - timing_s/adv:0.12525774911046028 - timing_s/update_actor:31.10750286374241 - timing_s/update_weights:22.652216041460633 - timing_s/step:107.71156236995012 - timing_s/stop_profile:6.70338049530983e-05 - timing_per_token_ms/adv:0.00010890358834508981 - timing_per_token_ms/update_actor:0.027045980870446577 - timing_per_token_ms/gen:0.032122197658779945 - timing_per_token_ms/ref:0.01013387427450299 - perf/total_num_tokens:1476966 - perf/time_per_step:107.71156236995012 - perf/throughput:3428.058157134417 - frontier/active_count:8.0 - frontier/completed_count:56.0 - frontier/blacklisted_count:2035.0 - frontier/mean_score:2.773786428778779 - frontier/mean_frontier_pct:0.9239218644986946 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:6.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:22.0 - frontier/replay_slots_count:56.0 - 
frontier/replay_pool_size:3548.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:224.0 - frontier/cluster_12/score:2.3334551398512997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:176.0 - frontier/cluster_18/score:3.2184437302842426 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:176.0 - frontier/cluster_21/score:2.8402868818780247 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:208.0 - frontier/cluster_23/score:2.6476816517173916 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - 
frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:224.0 - frontier/cluster_39/score:2.68130419093629 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:209.0 - frontier/cluster_43/score:0.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:208.0 - frontier/cluster_49/score:2.927659023200655 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:208.0 - frontier/cluster_51/score:2.6524038193623296 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:300.0 - frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:93.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.10515657927197801 - cluster/prob_snapshot/cluster_13:0.1301946394928453 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - 
cluster/prob_snapshot/cluster_18:0.14503837141587508 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.12799682648640887 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.11931711938268676 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.12083231080433225 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.13193423044513297 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.11952992270074075 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  12%|█▏        | 94/800 [2:51:23<20:25:48, 104.18s/it]
[36m(TaskRunner pid=2823680)[0m step:94 - global_seqlen/min:320503 - global_seqlen/max:376994 - global_seqlen/minmax_diff:56491 - global_seqlen/balanced_min:351143 - global_seqlen/balanced_max:351351 - global_seqlen/mean:351234.5 - frontier/skipped_zero_acc_count:19.0 - actor/entropy:np.float64(0.2656381517310034) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0112679498270154 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05679827177482366) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003165950324645647) - actor/ppo_kl:np.float64(8.447941110034662e-06) - actor/pg_clipfrac_lower:np.float64(3.283813302087682e-07) - actor/grad_norm:np.float64(0.2295310220548085) - perf/mfu/actor:np.float64(0.17764183183392435) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.37950897216797) - actor/lr:np.float64(1e-06) - training/global_step:94 - training/epoch:0 - critic/score/mean:0.6100917458534241 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6014403104782104 - critic/rewards/max:1.0136743783950806 - critic/rewards/min:-0.10330664366483688 - critic/advantages/mean:-0.15687435865402222 - critic/advantages/max:2.4748382568359375 - critic/advantages/min:-2.4748446941375732 - critic/returns/mean:-0.15687435865402222 - critic/returns/max:2.4748382568359375 - critic/returns/min:-2.4748446941375732 - response_length/mean:1047.23046875 - response_length/max:8192.0 - response_length/min:120.0 - response_length/clip_ratio:0.0022935778833925724 - response_length_non_aborted/mean:1047.23046875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:120.0 - response_length_non_aborted/clip_ratio:0.0022935778833925724 - response/aborted_ratio:0.0 - prompt_length/mean:247.32110595703125 - prompt_length/max:817.0 - prompt_length/min:170.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.120814502239227e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0285601653158665) - timing_s/agent_loop/generate_sequences/max:np.float64(26.947556698694825) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.157153133216525) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(26.947556698694825) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:198 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:28.831216561608016 - timing_s/reward:0.00013233255594968796 - timing_s/old_log_prob:9.93299622926861 - timing_s/ref:14.66291887499392 - timing_s/adv:0.07832138799130917 - timing_s/update_actor:22.994234315119684 - timing_s/update_weights:20.69613891839981 - timing_s/step:97.60312307998538 - timing_s/stop_profile:5.59808686375618e-05 - timing_per_token_ms/adv:6.938163385121409e-05 - timing_per_token_ms/update_actor:0.02036962810359905 - timing_per_token_ms/gen:0.03157215302661347 - timing_per_token_ms/ref:0.01298926506113211 - perf/total_num_tokens:1404938 - perf/time_per_step:97.60312307998538 - perf/throughput:3598.598988601673 - frontier/active_count:7.0 - frontier/completed_count:57.0 - frontier/blacklisted_count:2054.0 - frontier/mean_score:2.744531468820935 - frontier/mean_frontier_pct:0.957640373742943 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:4.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:22.0 - frontier/replay_slots_count:64.0 - frontier/replay_pool_size:3726.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:240.0 - frontier/cluster_12/score:1.9334185978959098 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.889056992999999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:192.0 - frontier/cluster_18/score:3.7529106111989696 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:224.0 - frontier/cluster_23/score:2.753377156202174 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:240.0 - frontier/cluster_39/score:2.1769129336554025 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:209.0 - frontier/cluster_43/score:0.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:208.0 - frontier/cluster_49/score:2.9493613162404584 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:208.0 - frontier/cluster_51/score:2.7566826735536303 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - 
frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:94.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.10063745305166093 - cluster/prob_snapshot/cluster_13:0.15037992176811737 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.19534484971471755 - 
cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.1433175746795676 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.11331171294034158 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.15351885583316077 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.1434896320124342 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  12%|█▏        | 95/800 [2:52:50<19:25:40, 99.21s/it] 
[36m(TaskRunner pid=2823680)[0m step:95 - global_seqlen/min:346802 - global_seqlen/max:435824 - global_seqlen/minmax_diff:89022 - global_seqlen/balanced_min:392520 - global_seqlen/balanced_max:392622 - global_seqlen/mean:392564.25 - frontier/skipped_zero_acc_count:55.0 - actor/entropy:np.float64(0.22288898142004335) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01038268394768238 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.09370100137311965) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00021103592336805524) - actor/ppo_kl:np.float64(-5.45188936945812e-06) - actor/pg_clipfrac_lower:np.float64(1.9881189891256745e-06) - actor/grad_norm:np.float64(0.20209936797618866) - perf/mfu/actor:np.float64(0.2616302787139639) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.36264419555664) - actor/lr:np.float64(1e-06) - training/global_step:95 - training/epoch:0 - critic/score/mean:0.551369845867157 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5424649119377136 - critic/rewards/max:0.9997177124023438 - critic/rewards/min:-0.07344706356525421 - critic/advantages/mean:-0.11179686337709427 - critic/advantages/max:2.4748423099517822 - critic/advantages/min:-2.474846363067627 - critic/returns/mean:-0.11179686337709427 - critic/returns/max:2.4748423099517822 - critic/returns/min:-2.474846363067627 - response_length/mean:1131.73974609375 - response_length/max:8192.0 - response_length/min:212.0 - response_length/clip_ratio:0.010273972526192665 - response_length_non_aborted/mean:1131.73974609375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:212.0 - response_length_non_aborted/clip_ratio:0.010273972526192665 - response/aborted_ratio:0.0 - prompt_length/mean:235.36985778808594 - prompt_length/max:474.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.478295058012009e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6229922771453857) - timing_s/agent_loop/generate_sequences/max:np.float64(28.04005645494908) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.958459687638424) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.04005645494908) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:242 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.568020552396774 - timing_s/reward:0.00015556532889604568 - timing_s/old_log_prob:8.235336585901678 - timing_s/ref:11.81851808540523 - timing_s/adv:0.09641009662300348 - timing_s/update_actor:17.60532370302826 - timing_s/update_weights:19.73778995219618 - timing_s/step:87.41338346432894 - timing_s/stop_profile:5.5013224482536316e-05 - timing_per_token_ms/adv:0.00012075533901016478 - timing_per_token_ms/update_actor:0.022050977092741737 - timing_per_token_ms/gen:0.04473658652637589 - timing_per_token_ms/ref:0.01480290143864822 - perf/total_num_tokens:1570257 - perf/time_per_step:87.41338346432894 - perf/throughput:4490.894122182045 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:55.0 - frontier/mean_score:2.0 - frontier/mean_frontier_pct:0.018476526169228624 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:6.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.7 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - 
frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.3 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.3 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - frontier/cluster_54/cluster_size:122.0 - 
frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:1.7 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:1.7 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:1.7 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:95.0 - cluster/prob_snapshot/cluster_0:0.015625 - cluster/prob_snapshot/cluster_1:0.015625 - cluster/prob_snapshot/cluster_2:0.01328125 - cluster/prob_snapshot/cluster_3:0.015625 - cluster/prob_snapshot/cluster_4:0.01328125 - cluster/prob_snapshot/cluster_5:0.015625 - cluster/prob_snapshot/cluster_6:0.015625 - cluster/prob_snapshot/cluster_7:0.015625 - cluster/prob_snapshot/cluster_8:0.015625 - cluster/prob_snapshot/cluster_9:0.01328125 - cluster/prob_snapshot/cluster_10:0.015625 - cluster/prob_snapshot/cluster_11:0.01328125 - cluster/prob_snapshot/cluster_12:0.015625 - cluster/prob_snapshot/cluster_13:0.015625 - cluster/prob_snapshot/cluster_14:0.015625 - cluster/prob_snapshot/cluster_15:0.01796875 - cluster/prob_snapshot/cluster_16:0.015625 - cluster/prob_snapshot/cluster_17:0.015625 - cluster/prob_snapshot/cluster_18:0.015625 - cluster/prob_snapshot/cluster_19:0.015625 - cluster/prob_snapshot/cluster_20:0.015625 - cluster/prob_snapshot/cluster_21:0.015625 - 
cluster/prob_snapshot/cluster_22:0.015625 - cluster/prob_snapshot/cluster_23:0.015625 - cluster/prob_snapshot/cluster_24:0.015625 - cluster/prob_snapshot/cluster_25:0.015625 - cluster/prob_snapshot/cluster_26:0.015625 - cluster/prob_snapshot/cluster_27:0.015625 - cluster/prob_snapshot/cluster_28:0.015625 - cluster/prob_snapshot/cluster_29:0.015625 - cluster/prob_snapshot/cluster_30:0.015625 - cluster/prob_snapshot/cluster_31:0.015625 - cluster/prob_snapshot/cluster_32:0.015625 - cluster/prob_snapshot/cluster_33:0.015625 - cluster/prob_snapshot/cluster_34:0.015625 - cluster/prob_snapshot/cluster_35:0.01796875 - cluster/prob_snapshot/cluster_36:0.01796875 - cluster/prob_snapshot/cluster_37:0.015625 - cluster/prob_snapshot/cluster_38:0.015625 - cluster/prob_snapshot/cluster_39:0.01796875 - cluster/prob_snapshot/cluster_40:0.02265625 - cluster/prob_snapshot/cluster_41:0.015625 - cluster/prob_snapshot/cluster_42:0.015625 - cluster/prob_snapshot/cluster_43:0.015625 - cluster/prob_snapshot/cluster_44:0.015625 - cluster/prob_snapshot/cluster_45:0.01328125 - cluster/prob_snapshot/cluster_46:0.015625 - cluster/prob_snapshot/cluster_47:0.015625 - cluster/prob_snapshot/cluster_48:0.015625 - cluster/prob_snapshot/cluster_49:0.015625 - cluster/prob_snapshot/cluster_50:0.01328125 - cluster/prob_snapshot/cluster_51:0.01796875 - cluster/prob_snapshot/cluster_52:0.015625 - cluster/prob_snapshot/cluster_53:0.015625 - cluster/prob_snapshot/cluster_54:0.015625 - cluster/prob_snapshot/cluster_55:0.01328125 - cluster/prob_snapshot/cluster_56:0.01328125 - cluster/prob_snapshot/cluster_57:0.015625 - cluster/prob_snapshot/cluster_58:0.015625 - cluster/prob_snapshot/cluster_59:0.01328125 - cluster/prob_snapshot/cluster_60:0.015625 - cluster/prob_snapshot/cluster_61:0.015625 - cluster/prob_snapshot/cluster_62:0.01796875 - cluster/prob_snapshot/cluster_63:0.015625
[36m(TaskRunner pid=2823680)[0m Training Progress:  12%|█▏        | 96/800 [2:54:36<19:47:44, 101.23s/it]
[36m(TaskRunner pid=2823680)[0m step:96 - global_seqlen/min:389202 - global_seqlen/max:459144 - global_seqlen/minmax_diff:69942 - global_seqlen/balanced_min:414995 - global_seqlen/balanced_max:415086 - global_seqlen/mean:415045.25 - frontier/skipped_zero_acc_count:46.0 - actor/entropy:np.float64(0.22687918488390563) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008760394528508186 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.03730329426980461) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00021628626191380957) - actor/ppo_kl:np.float64(4.112341676640116e-06) - actor/pg_clipfrac_lower:np.float64(2.865664522155052e-07) - actor/grad_norm:np.float64(0.19586400145834143) - perf/mfu/actor:np.float64(0.28658885780862486) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.05926513671875) - actor/lr:np.float64(1e-06) - training/global_step:96 - training/epoch:0 - critic/score/mean:0.5762194991111755 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5680691003799438 - critic/rewards/max:1.0049947500228882 - critic/rewards/min:-0.03553783893585205 - critic/advantages/mean:-0.12639859318733215 - critic/advantages/max:2.4748432636260986 - critic/advantages/min:-2.4748575687408447 - critic/returns/mean:-0.12639859318733215 - critic/returns/max:2.4748432636260986 - critic/returns/min:-2.4748575687408447 - response_length/mean:1235.856689453125 - response_length/max:8192.0 - response_length/min:214.0 - response_length/clip_ratio:0.006097560748457909 - response_length_non_aborted/mean:1235.856689453125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:214.0 - response_length_non_aborted/clip_ratio:0.006097560748457909 - response/aborted_ratio:0.0 - prompt_length/mean:238.40243530273438 - prompt_length/max:377.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.493196219205856e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6312720524147153) - timing_s/agent_loop/generate_sequences/max:np.float64(28.77673495747149) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.3949003074940265) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.77673495747149) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:193 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.68078514561057 - timing_s/reward:0.00020426325500011444 - timing_s/old_log_prob:8.525944804772735 - timing_s/ref:20.788572234101593 - timing_s/adv:0.056160020641982555 - timing_s/update_actor:17.05478247627616 - timing_s/update_weights:28.242872138507664 - timing_s/step:105.70939242374152 - timing_s/stop_profile:5.227699875831604e-05 - timing_per_token_ms/adv:5.806970082325616e-05 - timing_per_token_ms/update_actor:0.01763471780604578 - timing_per_token_ms/gen:0.03784378016830747 - timing_per_token_ms/ref:0.021495472337388966 - perf/total_num_tokens:1660181 - perf/time_per_step:105.70939242374152 - perf/throughput:3926.2854556600782 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:101.0 - frontier/mean_score:2.02390625 - frontier/mean_frontier_pct:0.028592420937960283 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.7 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.3 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.3 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - 
frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:1.7 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.3 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.3 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:1.7 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:2.51 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.53 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.3 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.3 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - frontier/cluster_54/cluster_size:122.0 - 
frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:1.7 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.09 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:1.7 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:96.0 - cluster/prob_snapshot/cluster_0:0.01544043850845364 - cluster/prob_snapshot/cluster_1:0.01544043850845364 - cluster/prob_snapshot/cluster_2:0.011503126688797962 - cluster/prob_snapshot/cluster_3:0.01544043850845364 - cluster/prob_snapshot/cluster_4:0.013124372732185594 - cluster/prob_snapshot/cluster_5:0.01544043850845364 - cluster/prob_snapshot/cluster_6:0.01544043850845364 - cluster/prob_snapshot/cluster_7:0.01544043850845364 - cluster/prob_snapshot/cluster_8:0.01544043850845364 - cluster/prob_snapshot/cluster_9:0.013124372732185594 - cluster/prob_snapshot/cluster_10:0.01544043850845364 - cluster/prob_snapshot/cluster_11:0.013124372732185594 - cluster/prob_snapshot/cluster_12:0.017756504284721683 - cluster/prob_snapshot/cluster_13:0.01544043850845364 - cluster/prob_snapshot/cluster_14:0.017756504284721683 - cluster/prob_snapshot/cluster_15:0.017756504284721683 - cluster/prob_snapshot/cluster_16:0.01544043850845364 - cluster/prob_snapshot/cluster_17:0.01544043850845364 - 
cluster/prob_snapshot/cluster_18:0.01544043850845364 - cluster/prob_snapshot/cluster_19:0.01544043850845364 - cluster/prob_snapshot/cluster_20:0.01544043850845364 - cluster/prob_snapshot/cluster_21:0.01544043850845364 - cluster/prob_snapshot/cluster_22:0.01544043850845364 - cluster/prob_snapshot/cluster_23:0.013124372732185594 - cluster/prob_snapshot/cluster_24:0.017756504284721683 - cluster/prob_snapshot/cluster_25:0.013124372732185594 - cluster/prob_snapshot/cluster_26:0.01544043850845364 - cluster/prob_snapshot/cluster_27:0.017756504284721683 - cluster/prob_snapshot/cluster_28:0.01544043850845364 - cluster/prob_snapshot/cluster_29:0.01544043850845364 - cluster/prob_snapshot/cluster_30:0.013124372732185594 - cluster/prob_snapshot/cluster_31:0.01544043850845364 - cluster/prob_snapshot/cluster_32:0.01544043850845364 - cluster/prob_snapshot/cluster_33:0.01544043850845364 - cluster/prob_snapshot/cluster_34:0.01544043850845364 - cluster/prob_snapshot/cluster_35:0.019377750328109317 - cluster/prob_snapshot/cluster_36:0.017756504284721683 - cluster/prob_snapshot/cluster_37:0.01544043850845364 - cluster/prob_snapshot/cluster_38:0.01544043850845364 - cluster/prob_snapshot/cluster_39:0.017756504284721683 - cluster/prob_snapshot/cluster_40:0.027252373967420675 - cluster/prob_snapshot/cluster_41:0.01544043850845364 - cluster/prob_snapshot/cluster_42:0.01544043850845364 - cluster/prob_snapshot/cluster_43:0.01544043850845364 - cluster/prob_snapshot/cluster_44:0.017756504284721683 - cluster/prob_snapshot/cluster_45:0.013124372732185594 - cluster/prob_snapshot/cluster_46:0.01544043850845364 - cluster/prob_snapshot/cluster_47:0.01544043850845364 - cluster/prob_snapshot/cluster_48:0.01544043850845364 - cluster/prob_snapshot/cluster_49:0.017756504284721683 - cluster/prob_snapshot/cluster_50:0.013124372732185594 - cluster/prob_snapshot/cluster_51:0.014745618775573226 - cluster/prob_snapshot/cluster_52:0.01544043850845364 - cluster/prob_snapshot/cluster_53:0.013124372732185594 - 
cluster/prob_snapshot/cluster_54:0.01544043850845364 - cluster/prob_snapshot/cluster_55:0.013124372732185594 - cluster/prob_snapshot/cluster_56:0.016135258241334053 - cluster/prob_snapshot/cluster_57:0.01544043850845364 - cluster/prob_snapshot/cluster_58:0.017756504284721683 - cluster/prob_snapshot/cluster_59:0.013124372732185594 - cluster/prob_snapshot/cluster_60:0.01544043850845364 - cluster/prob_snapshot/cluster_61:0.01544043850845364 - cluster/prob_snapshot/cluster_62:0.017756504284721683 - cluster/prob_snapshot/cluster_63:0.01544043850845364
[36m(TaskRunner pid=2823680)[0m Training Progress:  12%|█▏        | 97/800 [2:56:24<20:10:13, 103.29s/it]
[36m(TaskRunner pid=2823680)[0m step:97 - global_seqlen/min:307549 - global_seqlen/max:447760 - global_seqlen/minmax_diff:140211 - global_seqlen/balanced_min:376351 - global_seqlen/balanced_max:376489 - global_seqlen/mean:376443.5 - frontier/skipped_zero_acc_count:35.0 - actor/entropy:np.float64(0.21732885093289486) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009002834558486938 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0550703960852843) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002718301629043863) - actor/ppo_kl:np.float64(8.563673115052314e-06) - actor/pg_clipfrac_lower:np.float64(1.9232494060874006e-06) - actor/grad_norm:np.float64(0.20943877597649893) - perf/mfu/actor:np.float64(0.21579522211463104) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.35633850097656) - actor/lr:np.float64(1e-06) - training/global_step:97 - training/epoch:0 - critic/score/mean:0.5147849321365356 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5063607692718506 - critic/rewards/max:1.0076457262039185 - critic/rewards/min:-0.06664498150348663 - critic/advantages/mean:-0.11551809310913086 - critic/advantages/max:2.4748458862304688 - critic/advantages/min:-2.474851608276367 - critic/returns/mean:-0.11551809310913086 - critic/returns/max:2.4748458862304688 - critic/returns/min:-2.474851608276367 - response_length/mean:1181.938232421875 - response_length/max:8192.0 - response_length/min:237.0 - response_length/clip_ratio:0.004032257944345474 - response_length_non_aborted/mean:1181.938232421875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:237.0 - response_length_non_aborted/clip_ratio:0.004032257944345474 - response/aborted_ratio:0.0 - prompt_length/mean:233.8279571533203 - prompt_length/max:381.0 - prompt_length/min:178.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.463114500045776e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7028677007183433) - timing_s/agent_loop/generate_sequences/max:np.float64(27.740589884109795) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.879515149445979) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.740589884109795) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:237 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.563714381307364 - timing_s/reward:0.00011610332876443863 - timing_s/old_log_prob:8.416680400259793 - timing_s/ref:21.265452274121344 - timing_s/adv:0.06505323573946953 - timing_s/update_actor:20.315970972180367 - timing_s/update_weights:27.82171211671084 - timing_s/step:107.88171530980617 - timing_s/stop_profile:5.3404830396175385e-05 - timing_per_token_ms/adv:6.175959646024468e-05 - timing_per_token_ms/update_actor:0.01928737525009291 - timing_per_token_ms/gen:0.03361950411924482 - timing_per_token_ms/ref:0.020188784401964574 - perf/total_num_tokens:1505774 - perf/time_per_step:107.88171530980617 - perf/throughput:3489.4096642694212 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:136.0 - frontier/mean_score:2.062203125 - frontier/mean_frontier_pct:0.04059235727845749 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.3 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.9 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.3 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.7 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:2.51 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.3 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - 
frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:1.7 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.3 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.3 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:1.7 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:2.51 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.3709999999999996 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.3 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.3 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - 
frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:1.7 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.09 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.09 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:97.0 - cluster/prob_snapshot/cluster_0:0.017426750820193813 - cluster/prob_snapshot/cluster_1:0.015153696365385928 - cluster/prob_snapshot/cluster_2:0.011289503792212515 - cluster/prob_snapshot/cluster_3:0.015153696365385928 - cluster/prob_snapshot/cluster_4:0.012880641910578038 - cluster/prob_snapshot/cluster_5:0.015153696365385928 - cluster/prob_snapshot/cluster_6:0.015153696365385928 - cluster/prob_snapshot/cluster_7:0.021972859729809596 - cluster/prob_snapshot/cluster_8:0.017426750820193813 - cluster/prob_snapshot/cluster_9:0.012880641910578038 - cluster/prob_snapshot/cluster_10:0.015153696365385928 - cluster/prob_snapshot/cluster_11:0.012880641910578038 - cluster/prob_snapshot/cluster_12:0.019017888938559338 - cluster/prob_snapshot/cluster_13:0.017426750820193813 - cluster/prob_snapshot/cluster_14:0.019017888938559338 - cluster/prob_snapshot/cluster_15:0.017426750820193813 - cluster/prob_snapshot/cluster_16:0.015153696365385928 - 
cluster/prob_snapshot/cluster_17:0.015153696365385928 - cluster/prob_snapshot/cluster_18:0.015153696365385928 - cluster/prob_snapshot/cluster_19:0.017426750820193813 - cluster/prob_snapshot/cluster_20:0.015153696365385928 - cluster/prob_snapshot/cluster_21:0.015153696365385928 - cluster/prob_snapshot/cluster_22:0.012880641910578038 - cluster/prob_snapshot/cluster_23:0.012880641910578038 - cluster/prob_snapshot/cluster_24:0.017426750820193813 - cluster/prob_snapshot/cluster_25:0.012880641910578038 - cluster/prob_snapshot/cluster_26:0.015153696365385928 - cluster/prob_snapshot/cluster_27:0.017426750820193813 - cluster/prob_snapshot/cluster_28:0.015153696365385928 - cluster/prob_snapshot/cluster_29:0.015153696365385928 - cluster/prob_snapshot/cluster_30:0.012880641910578038 - cluster/prob_snapshot/cluster_31:0.015153696365385928 - cluster/prob_snapshot/cluster_32:0.015153696365385928 - cluster/prob_snapshot/cluster_33:0.012880641910578038 - cluster/prob_snapshot/cluster_34:0.015153696365385928 - cluster/prob_snapshot/cluster_35:0.019017888938559338 - cluster/prob_snapshot/cluster_36:0.019017888938559338 - cluster/prob_snapshot/cluster_37:0.015153696365385928 - cluster/prob_snapshot/cluster_38:0.015153696365385928 - cluster/prob_snapshot/cluster_39:0.017426750820193813 - cluster/prob_snapshot/cluster_40:0.025541555223857978 - cluster/prob_snapshot/cluster_41:0.017426750820193813 - cluster/prob_snapshot/cluster_42:0.015153696365385928 - cluster/prob_snapshot/cluster_43:0.015153696365385928 - cluster/prob_snapshot/cluster_44:0.017426750820193813 - cluster/prob_snapshot/cluster_45:0.012880641910578038 - cluster/prob_snapshot/cluster_46:0.017426750820193813 - cluster/prob_snapshot/cluster_47:0.015153696365385928 - cluster/prob_snapshot/cluster_48:0.015153696365385928 - cluster/prob_snapshot/cluster_49:0.017426750820193813 - cluster/prob_snapshot/cluster_50:0.011289503792212515 - cluster/prob_snapshot/cluster_51:0.01447178002894356 - 
cluster/prob_snapshot/cluster_52:0.015153696365385928 - cluster/prob_snapshot/cluster_53:0.012880641910578038 - cluster/prob_snapshot/cluster_54:0.012880641910578038 - cluster/prob_snapshot/cluster_55:0.012880641910578038 - cluster/prob_snapshot/cluster_56:0.015835612701828292 - cluster/prob_snapshot/cluster_57:0.015153696365385928 - cluster/prob_snapshot/cluster_58:0.017426750820193813 - cluster/prob_snapshot/cluster_59:0.015835612701828292 - cluster/prob_snapshot/cluster_60:0.015153696365385928 - cluster/prob_snapshot/cluster_61:0.015153696365385928 - cluster/prob_snapshot/cluster_62:0.017426750820193813 - cluster/prob_snapshot/cluster_63:0.015153696365385928
[36m(RewardLoopWorker pid=2826762)[0m WARNING:2026-04-12 14:28:49,448:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  12%|█▏        | 98/800 [2:58:15<20:33:27, 105.42s/it]
[36m(TaskRunner pid=2823680)[0m step:98 - global_seqlen/min:327692 - global_seqlen/max:425142 - global_seqlen/minmax_diff:97450 - global_seqlen/balanced_min:372254 - global_seqlen/balanced_max:372382 - global_seqlen/mean:372337.25 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.2394504501693407) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010825589299201965 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0232666637893999) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006994628164815071) - actor/ppo_kl:np.float64(4.587005825057104e-05) - actor/pg_clipfrac_lower:np.float64(5.044147526080321e-06) - actor/grad_norm:np.float64(0.2666870206594467) - perf/mfu/actor:np.float64(0.20127247619242783) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.25281524658203) - actor/lr:np.float64(1e-06) - training/global_step:98 - training/epoch:0 - critic/score/mean:0.5178571343421936 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5088238716125488 - critic/rewards/max:1.0043824911117554 - critic/rewards/min:-0.047552164644002914 - critic/advantages/mean:-0.10834790766239166 - critic/advantages/max:2.4747650623321533 - critic/advantages/min:-2.4748497009277344 - critic/returns/mean:-0.10834790766239166 - critic/returns/max:2.4747650623321533 - critic/returns/min:-2.4748497009277344 - response_length/mean:1147.36474609375 - response_length/max:8192.0 - response_length/min:202.0 - response_length/clip_ratio:0.011479591950774193 - response_length_non_aborted/mean:1147.36474609375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:202.0 - response_length_non_aborted/clip_ratio:0.011479591950774193 - response/aborted_ratio:0.0 - prompt_length/mean:227.91836547851562 - prompt_length/max:327.0 - prompt_length/min:170.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.410215377807617e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5495448596775532) - timing_s/agent_loop/generate_sequences/max:np.float64(28.47766407672316) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.68324373834821) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.47766407672316) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:221 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.42362423427403 - timing_s/reward:0.00014662370085716248 - timing_s/old_log_prob:10.244416361674666 - timing_s/ref:19.772265532054007 - timing_s/adv:0.07211283128708601 - timing_s/update_actor:21.59484920743853 - timing_s/update_weights:26.94834312144667 - timing_s/step:109.46499306149781 - timing_s/stop_profile:5.8935023844242096e-05 - timing_per_token_ms/adv:6.688124642892281e-05 - timing_per_token_ms/update_actor:0.020028203104220214 - timing_per_token_ms/gen:0.03382153896825915 - timing_per_token_ms/ref:0.01833784279309271 - perf/total_num_tokens:1489349 - perf/time_per_step:109.46499306149781 - perf/throughput:3401.427612486301 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:166.0 - frontier/mean_score:2.0776671875 - frontier/mean_frontier_pct:0.06516847147133709 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.11 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.9 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.3 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.7 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.0569999999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.3 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.6569999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:1.7 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.7 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:1.7 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.3 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.51 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:1.7 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:1.7 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:2.6569999999999996 - frontier/cluster_35/cluster_size:184.0 - 
frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:1.91 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:48.0 - frontier/cluster_40/score:3.8596999999999997 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.3 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.51 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - 
frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:1.7 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.09 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.09 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:98.0 - cluster/prob_snapshot/cluster_0:0.023388611175243868 - cluster/prob_snapshot/cluster_1:0.015040907508195415 - cluster/prob_snapshot/cluster_2:0.011205476093605584 - cluster/prob_snapshot/cluster_3:0.015040907508195415 - cluster/prob_snapshot/cluster_4:0.012784771381966103 - cluster/prob_snapshot/cluster_5:0.015040907508195415 - cluster/prob_snapshot/cluster_6:0.015040907508195415 - cluster/prob_snapshot/cluster_7:0.02180931588688335 - cluster/prob_snapshot/cluster_8:0.017297043634424726 - cluster/prob_snapshot/cluster_9:0.012784771381966103 - cluster/prob_snapshot/cluster_10:0.015040907508195415 - cluster/prob_snapshot/cluster_11:0.012784771381966103 - cluster/prob_snapshot/cluster_12:0.01546957337217898 - cluster/prob_snapshot/cluster_13:0.017297043634424726 - cluster/prob_snapshot/cluster_14:0.019981845624637607 - 
cluster/prob_snapshot/cluster_15:0.017297043634424726 - cluster/prob_snapshot/cluster_16:0.015040907508195415 - cluster/prob_snapshot/cluster_17:0.012784771381966103 - cluster/prob_snapshot/cluster_18:0.015040907508195415 - cluster/prob_snapshot/cluster_19:0.018876338922785244 - cluster/prob_snapshot/cluster_20:0.015040907508195415 - cluster/prob_snapshot/cluster_21:0.012784771381966103 - cluster/prob_snapshot/cluster_22:0.012784771381966103 - cluster/prob_snapshot/cluster_23:0.012784771381966103 - cluster/prob_snapshot/cluster_24:0.017297043634424726 - cluster/prob_snapshot/cluster_25:0.012784771381966103 - cluster/prob_snapshot/cluster_26:0.015040907508195415 - cluster/prob_snapshot/cluster_27:0.018876338922785244 - cluster/prob_snapshot/cluster_28:0.015040907508195415 - cluster/prob_snapshot/cluster_29:0.015040907508195415 - cluster/prob_snapshot/cluster_30:0.012784771381966103 - cluster/prob_snapshot/cluster_31:0.015040907508195415 - cluster/prob_snapshot/cluster_32:0.012784771381966103 - cluster/prob_snapshot/cluster_33:0.012784771381966103 - cluster/prob_snapshot/cluster_34:0.017297043634424726 - cluster/prob_snapshot/cluster_35:0.019981845624637607 - cluster/prob_snapshot/cluster_36:0.018876338922785244 - cluster/prob_snapshot/cluster_37:0.015040907508195415 - cluster/prob_snapshot/cluster_38:0.015040907508195415 - cluster/prob_snapshot/cluster_39:0.01436406667032662 - cluster/prob_snapshot/cluster_40:0.02902669535469092 - cluster/prob_snapshot/cluster_41:0.017297043634424726 - cluster/prob_snapshot/cluster_42:0.015040907508195415 - cluster/prob_snapshot/cluster_43:0.015040907508195415 - cluster/prob_snapshot/cluster_44:0.017297043634424726 - cluster/prob_snapshot/cluster_45:0.012784771381966103 - cluster/prob_snapshot/cluster_46:0.017297043634424726 - cluster/prob_snapshot/cluster_47:0.015040907508195415 - cluster/prob_snapshot/cluster_48:0.015040907508195415 - cluster/prob_snapshot/cluster_49:0.018876338922785244 - 
cluster/prob_snapshot/cluster_50:0.011205476093605584 - cluster/prob_snapshot/cluster_51:0.01436406667032662 - cluster/prob_snapshot/cluster_52:0.015040907508195415 - cluster/prob_snapshot/cluster_53:0.012784771381966103 - cluster/prob_snapshot/cluster_54:0.012784771381966103 - cluster/prob_snapshot/cluster_55:0.012784771381966103 - cluster/prob_snapshot/cluster_56:0.01571774834606421 - cluster/prob_snapshot/cluster_57:0.017297043634424726 - cluster/prob_snapshot/cluster_58:0.017297043634424726 - cluster/prob_snapshot/cluster_59:0.01571774834606421 - cluster/prob_snapshot/cluster_60:0.012784771381966103 - cluster/prob_snapshot/cluster_61:0.015040907508195415 - cluster/prob_snapshot/cluster_62:0.018876338922785244 - cluster/prob_snapshot/cluster_63:0.015040907508195415
(RewardLoopWorker pid=2826755) WARNING:2026-04-12 14:30:40,017:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  12%|█▏        | 99/800 [3:00:00<20:29:57, 105.27s/it]
[36m(TaskRunner pid=2823680)[0m step:99 - global_seqlen/min:331329 - global_seqlen/max:418743 - global_seqlen/minmax_diff:87414 - global_seqlen/balanced_min:375100 - global_seqlen/balanced_max:375321 - global_seqlen/mean:375214.75 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.23300600825912424) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010592211037874222 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.015260332322213799) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00031302584785508874) - actor/ppo_kl:np.float64(4.8722836001186604e-05) - actor/pg_clipfrac_lower:np.float64(2.3513680692606915e-06) - actor/grad_norm:np.float64(0.2023781550427278) - perf/mfu/actor:np.float64(0.2271951501572258) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.28782272338867) - actor/lr:np.float64(1e-06) - training/global_step:99 - training/epoch:0 - critic/score/mean:0.5652777552604675 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5563547015190125 - critic/rewards/max:1.0216723680496216 - critic/rewards/min:-0.0349484421312809 - critic/advantages/mean:-0.09564045071601868 - critic/advantages/max:2.474825143814087 - critic/advantages/min:-2.4748129844665527 - critic/returns/mean:-0.09564045071601868 - critic/returns/max:2.474825143814087 - critic/returns/min:-2.4748129844665527 - response_length/mean:1099.1417236328125 - response_length/max:8192.0 - response_length/min:150.0 - response_length/clip_ratio:0.0069444444961845875 - response_length_non_aborted/mean:1099.1417236328125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:150.0 - response_length_non_aborted/clip_ratio:0.0069444444961845875 - response/aborted_ratio:0.0 - prompt_length/mean:236.91111755371094 - prompt_length/max:381.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.885189890861511e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.213588871061802) - timing_s/agent_loop/generate_sequences/max:np.float64(27.655486594885588) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.826177373085557) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.655486594885588) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:248 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.012555052526295 - timing_s/reward:0.00023938901722431183 - timing_s/old_log_prob:8.880198625847697 - timing_s/ref:19.310700067318976 - timing_s/adv:0.09368852991610765 - timing_s/update_actor:19.29399090539664 - timing_s/update_weights:26.101242325268686 - timing_s/step:104.11500910762697 - timing_s/stop_profile:6.480235606431961e-05 - timing_per_token_ms/adv:9.739357634752001e-05 - timing_per_token_ms/update_actor:0.02005699927169028 - timing_per_token_ms/gen:0.037924232611464874 - timing_per_token_ms/ref:0.020074369221233127 - perf/total_num_tokens:1500859 - perf/time_per_step:104.11500910762697 - perf/throughput:3603.8487939056768 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:204.0 - frontier/mean_score:2.1305703124999997 - frontier/mean_frontier_pct:0.07636242839955584 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.11 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:3.53 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.3 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.7 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.9 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.339899999999999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.3 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:2.7598999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:1.49 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.3 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.7 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.09 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.3 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.51 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:1.7 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.3 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:1.7 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:2.6569999999999996 - frontier/cluster_35/cluster_size:184.0 - 
frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.237 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:48.0 - frontier/cluster_40/score:3.8596999999999997 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:1.7 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.3 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.51 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.51 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - 
frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:1.7 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.3629999999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.09 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:99.0 - cluster/prob_snapshot/cluster_0:0.02280786027802122 - cluster/prob_snapshot/cluster_1:0.014667434262393067 - cluster/prob_snapshot/cluster_2:0.010927238525482835 - cluster/prob_snapshot/cluster_3:0.014667434262393067 - cluster/prob_snapshot/cluster_4:0.015327468804200753 - cluster/prob_snapshot/cluster_5:0.014667434262393067 - cluster/prob_snapshot/cluster_6:0.014667434262393067 - cluster/prob_snapshot/cluster_7:0.025888021473123763 - cluster/prob_snapshot/cluster_8:0.016867549401752027 - cluster/prob_snapshot/cluster_9:0.012467319123034106 - cluster/prob_snapshot/cluster_10:0.02126777968046995 - cluster/prob_snapshot/cluster_11:0.012467319123034106 - cluster/prob_snapshot/cluster_12:0.017160164715286762 - cluster/prob_snapshot/cluster_13:0.016867549401752027 - cluster/prob_snapshot/cluster_14:0.02024032591038931 - 
cluster/prob_snapshot/cluster_15:0.016867549401752027 - cluster/prob_snapshot/cluster_16:0.014667434262393067 - cluster/prob_snapshot/cluster_17:0.010927238525482835 - cluster/prob_snapshot/cluster_18:0.016867549401752027 - cluster/prob_snapshot/cluster_19:0.019485686417589188 - cluster/prob_snapshot/cluster_20:0.014667434262393067 - cluster/prob_snapshot/cluster_21:0.012467319123034106 - cluster/prob_snapshot/cluster_22:0.012467319123034106 - cluster/prob_snapshot/cluster_23:0.015327468804200753 - cluster/prob_snapshot/cluster_24:0.016867549401752027 - cluster/prob_snapshot/cluster_25:0.010927238525482835 - cluster/prob_snapshot/cluster_26:0.014667434262393067 - cluster/prob_snapshot/cluster_27:0.018407629999303298 - cluster/prob_snapshot/cluster_28:0.014667434262393067 - cluster/prob_snapshot/cluster_29:0.014667434262393067 - cluster/prob_snapshot/cluster_30:0.012467319123034106 - cluster/prob_snapshot/cluster_31:0.016867549401752027 - cluster/prob_snapshot/cluster_32:0.012467319123034106 - cluster/prob_snapshot/cluster_33:0.012467319123034106 - cluster/prob_snapshot/cluster_34:0.016867549401752027 - cluster/prob_snapshot/cluster_35:0.019485686417589188 - cluster/prob_snapshot/cluster_36:0.018407629999303298 - cluster/prob_snapshot/cluster_37:0.014667434262393067 - cluster/prob_snapshot/cluster_38:0.014667434262393067 - cluster/prob_snapshot/cluster_39:0.016405525222486648 - cluster/prob_snapshot/cluster_40:0.02830594801127926 - cluster/prob_snapshot/cluster_41:0.016867549401752027 - cluster/prob_snapshot/cluster_42:0.012467319123034106 - cluster/prob_snapshot/cluster_43:0.014667434262393067 - cluster/prob_snapshot/cluster_44:0.016867549401752027 - cluster/prob_snapshot/cluster_45:0.012467319123034106 - cluster/prob_snapshot/cluster_46:0.018407629999303298 - cluster/prob_snapshot/cluster_47:0.014667434262393067 - cluster/prob_snapshot/cluster_48:0.014667434262393067 - cluster/prob_snapshot/cluster_49:0.018407629999303298 - 
cluster/prob_snapshot/cluster_50:0.009849182107196944 - cluster/prob_snapshot/cluster_51:0.014007399720585378 - cluster/prob_snapshot/cluster_52:0.014667434262393067 - cluster/prob_snapshot/cluster_53:0.012467319123034106 - cluster/prob_snapshot/cluster_54:0.012467319123034106 - cluster/prob_snapshot/cluster_55:0.012467319123034106 - cluster/prob_snapshot/cluster_56:0.017329573581017405 - cluster/prob_snapshot/cluster_57:0.016867549401752027 - cluster/prob_snapshot/cluster_58:0.016867549401752027 - cluster/prob_snapshot/cluster_59:0.015327468804200753 - cluster/prob_snapshot/cluster_60:0.012467319123034106 - cluster/prob_snapshot/cluster_61:0.014667434262393067 - cluster/prob_snapshot/cluster_62:0.018407629999303298 - cluster/prob_snapshot/cluster_63:0.014667434262393067
[36m(TaskRunner pid=2823680)[0m local_global_step_folder: /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/checkpoints/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633/global_step_100
[36m(WorkerDict pid=2825160)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.
[36m(WorkerDict pid=2825160)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")
[36m(TaskRunner pid=2823680)[0m test_gen_batch meta info: {'eos_token_id': 151643, 'pad_token_id': 151643, 'recompute_log_prob': False, 'do_sample': True, 'validate': True, 'global_steps': 100}
[36m(TaskRunner pid=2823680)[0m validation generation end
[36m(TaskRunner pid=2823680)[0m Updated best checkpoint at step 100: val-core/aime2025/acc/best@16/mean=0.259033
[36m(TaskRunner pid=2823680)[0m Training Progress:  12%|█▎        | 100/800 [3:05:10<32:25:25, 166.75s/it]
[36m(WorkerDict pid=2825159)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.[32m [repeated 3x across cluster][0m
[36m(WorkerDict pid=2825159)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")[32m [repeated 3x across cluster][0m
[36m(TaskRunner pid=2823680)[0m step:100 - global_seqlen/min:353121 - global_seqlen/max:376107 - global_seqlen/minmax_diff:22986 - global_seqlen/balanced_min:365092 - global_seqlen/balanced_max:365190 - global_seqlen/mean:365136.25 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.2288423142551134) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009730087593197823 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.02309485244040843) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00021991652412604404) - actor/ppo_kl:np.float64(1.7288236791183447e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.23734082281589508) - perf/mfu/actor:np.float64(0.19933108346921513) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.89387512207031) - actor/lr:np.float64(1e-06) - val-aux/aime2024/reward/mean@16:np.float64(0.07291666666666667) - val-aux/aime2024/reward/std@16:np.float64(0.11051529393333274) - val-aux/aime2024/reward/best@2/mean:np.float64(0.11710000000000001) - val-aux/aime2024/reward/best@2/std:np.float64(0.12202649554744702) - val-aux/aime2024/reward/worst@2/mean:np.float64(0.0298) - val-aux/aime2024/reward/worst@2/std:np.float64(0.06606852032756277) - val-aux/aime2024/reward/maj@2/mean:np.float64(0.0724) - val-aux/aime2024/reward/maj@2/std:np.float64(0.11110609505171257) - val-aux/aime2024/reward/best@4/mean:np.float64(0.16843333333333332) - val-aux/aime2024/reward/best@4/std:np.float64(0.11702101273676638) - val-aux/aime2024/reward/worst@4/mean:np.float64(0.0072) - val-aux/aime2024/reward/worst@4/std:np.float64(0.025918542638707527) - val-aux/aime2024/reward/maj@4/mean:np.float64(0.0889) - val-aux/aime2024/reward/maj@4/std:np.float64(0.10522669565541289) - val-aux/aime2024/reward/best@8/mean:np.float64(0.21473333333333333) - 
val-aux/aime2024/reward/best@8/std:np.float64(0.09681599247883822) - val-aux/aime2024/reward/worst@8/mean:np.float64(0.0007333333333333333) - val-aux/aime2024/reward/worst@8/std:np.float64(0.006535668472973334) - val-aux/aime2024/reward/maj@8/mean:np.float64(0.10353333333333332) - val-aux/aime2024/reward/maj@8/std:np.float64(0.09307051722306327) - val-aux/aime2024/reward/best@16/mean:np.float64(0.2579333333333333) - val-aux/aime2024/reward/best@16/std:np.float64(0.06844035612363658) - val-aux/aime2024/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2024/reward/worst@16/std:np.float64(0.0) - val-aux/aime2024/reward/maj@16/mean:np.float64(0.11543333333333332) - val-aux/aime2024/reward/maj@16/std:np.float64(0.07501587070696417) - val-aux/aime2024/score/mean@16:np.float64(0.07291666666666667) - val-aux/aime2024/score/std@16:np.float64(0.11051529393333274) - val-aux/aime2024/score/best@2/mean:np.float64(0.11710000000000001) - val-aux/aime2024/score/best@2/std:np.float64(0.12202649554744702) - val-aux/aime2024/score/worst@2/mean:np.float64(0.0298) - val-aux/aime2024/score/worst@2/std:np.float64(0.06606852032756277) - val-aux/aime2024/score/maj@2/mean:np.float64(0.0724) - val-aux/aime2024/score/maj@2/std:np.float64(0.11110609505171257) - val-aux/aime2024/score/best@4/mean:np.float64(0.16843333333333332) - val-aux/aime2024/score/best@4/std:np.float64(0.11702101273676638) - val-aux/aime2024/score/worst@4/mean:np.float64(0.0072) - val-aux/aime2024/score/worst@4/std:np.float64(0.025918542638707527) - val-aux/aime2024/score/maj@4/mean:np.float64(0.0889) - val-aux/aime2024/score/maj@4/std:np.float64(0.10522669565541289) - val-aux/aime2024/score/best@8/mean:np.float64(0.21473333333333333) - val-aux/aime2024/score/best@8/std:np.float64(0.09681599247883822) - val-aux/aime2024/score/worst@8/mean:np.float64(0.0007333333333333333) - val-aux/aime2024/score/worst@8/std:np.float64(0.006535668472973334) - val-aux/aime2024/score/maj@8/mean:np.float64(0.10353333333333332) - 
val-aux/aime2024/score/maj@8/std:np.float64(0.09307051722306327) - val-aux/aime2024/score/best@16/mean:np.float64(0.2579333333333333) - val-aux/aime2024/score/best@16/std:np.float64(0.06844035612363658) - val-aux/aime2024/score/worst@16/mean:np.float64(0.0) - val-aux/aime2024/score/worst@16/std:np.float64(0.0) - val-aux/aime2024/score/maj@16/mean:np.float64(0.11543333333333332) - val-aux/aime2024/score/maj@16/std:np.float64(0.07501587070696417) - val-core/aime2024/acc/mean@16:np.float64(0.07291666666666667) - val-aux/aime2024/acc/std@16:np.float64(0.11051529393333274) - val-aux/aime2024/acc/best@2/mean:np.float64(0.11710000000000001) - val-aux/aime2024/acc/best@2/std:np.float64(0.12202649554744702) - val-aux/aime2024/acc/worst@2/mean:np.float64(0.0298) - val-aux/aime2024/acc/worst@2/std:np.float64(0.06606852032756277) - val-aux/aime2024/acc/maj@2/mean:np.float64(0.0724) - val-aux/aime2024/acc/maj@2/std:np.float64(0.11110609505171257) - val-aux/aime2024/acc/best@4/mean:np.float64(0.16843333333333332) - val-aux/aime2024/acc/best@4/std:np.float64(0.11702101273676638) - val-aux/aime2024/acc/worst@4/mean:np.float64(0.0072) - val-aux/aime2024/acc/worst@4/std:np.float64(0.025918542638707527) - val-aux/aime2024/acc/maj@4/mean:np.float64(0.0889) - val-aux/aime2024/acc/maj@4/std:np.float64(0.10522669565541289) - val-aux/aime2024/acc/best@8/mean:np.float64(0.21473333333333333) - val-aux/aime2024/acc/best@8/std:np.float64(0.09681599247883822) - val-aux/aime2024/acc/worst@8/mean:np.float64(0.0007333333333333333) - val-aux/aime2024/acc/worst@8/std:np.float64(0.006535668472973334) - val-aux/aime2024/acc/maj@8/mean:np.float64(0.10353333333333332) - val-aux/aime2024/acc/maj@8/std:np.float64(0.09307051722306327) - val-core/aime2024/acc/best@16/mean:np.float64(0.2579333333333333) - val-core/aime2024/acc/best@16/std:np.float64(0.06844035612363658) - val-aux/aime2024/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2024/acc/worst@16/std:np.float64(0.0) - 
val-core/aime2024/acc/maj@16/mean:np.float64(0.11543333333333332) - val-core/aime2024/acc/maj@16/std:np.float64(0.07501587070696417) - val-aux/aime2025/reward/mean@16:np.float64(0.075) - val-aux/aime2025/reward/std@16:np.float64(0.11303097869136272) - val-aux/aime2025/reward/best@2/mean:np.float64(0.12026666666666667) - val-aux/aime2025/reward/best@2/std:np.float64(0.12293479659915928) - val-aux/aime2025/reward/worst@2/mean:np.float64(0.0272) - val-aux/aime2025/reward/worst@2/std:np.float64(0.06464278709495463) - val-aux/aime2025/reward/maj@2/mean:np.float64(0.0737) - val-aux/aime2025/reward/maj@2/std:np.float64(0.11273107654299482) - val-aux/aime2025/reward/best@4/mean:np.float64(0.17296666666666669) - val-aux/aime2025/reward/best@4/std:np.float64(0.11555919982117263) - val-aux/aime2025/reward/worst@4/mean:np.float64(0.0064) - val-aux/aime2025/reward/worst@4/std:np.float64(0.026056613638061114) - val-aux/aime2025/reward/maj@4/mean:np.float64(0.09086666666666665) - val-aux/aime2025/reward/maj@4/std:np.float64(0.1090674671461239) - val-aux/aime2025/reward/best@8/mean:np.float64(0.21883333333333332) - val-aux/aime2025/reward/best@8/std:np.float64(0.09382863615353057) - val-aux/aime2025/reward/worst@8/mean:np.float64(0.0004) - val-aux/aime2025/reward/worst@8/std:np.float64(0.004530313536682326) - val-aux/aime2025/reward/maj@8/mean:np.float64(0.108) - val-aux/aime2025/reward/maj@8/std:np.float64(0.09508918365000583) - val-aux/aime2025/reward/best@16/mean:np.float64(0.25903333333333334) - val-aux/aime2025/reward/best@16/std:np.float64(0.06690465714500902) - val-aux/aime2025/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2025/reward/worst@16/std:np.float64(0.0) - val-aux/aime2025/reward/maj@16/mean:np.float64(0.12466666666666665) - val-aux/aime2025/reward/maj@16/std:np.float64(0.07413997539344604) - val-aux/aime2025/score/mean@16:np.float64(0.075) - val-aux/aime2025/score/std@16:np.float64(0.11303097869136272) - 
val-aux/aime2025/score/best@2/mean:np.float64(0.12026666666666667) - val-aux/aime2025/score/best@2/std:np.float64(0.12293479659915928) - val-aux/aime2025/score/worst@2/mean:np.float64(0.0272) - val-aux/aime2025/score/worst@2/std:np.float64(0.06464278709495463) - val-aux/aime2025/score/maj@2/mean:np.float64(0.0737) - val-aux/aime2025/score/maj@2/std:np.float64(0.11273107654299482) - val-aux/aime2025/score/best@4/mean:np.float64(0.17296666666666669) - val-aux/aime2025/score/best@4/std:np.float64(0.11555919982117263) - val-aux/aime2025/score/worst@4/mean:np.float64(0.0064) - val-aux/aime2025/score/worst@4/std:np.float64(0.026056613638061114) - val-aux/aime2025/score/maj@4/mean:np.float64(0.09086666666666665) - val-aux/aime2025/score/maj@4/std:np.float64(0.1090674671461239) - val-aux/aime2025/score/best@8/mean:np.float64(0.21883333333333332) - val-aux/aime2025/score/best@8/std:np.float64(0.09382863615353057) - val-aux/aime2025/score/worst@8/mean:np.float64(0.0004) - val-aux/aime2025/score/worst@8/std:np.float64(0.004530313536682326) - val-aux/aime2025/score/maj@8/mean:np.float64(0.108) - val-aux/aime2025/score/maj@8/std:np.float64(0.09508918365000583) - val-aux/aime2025/score/best@16/mean:np.float64(0.25903333333333334) - val-aux/aime2025/score/best@16/std:np.float64(0.06690465714500902) - val-aux/aime2025/score/worst@16/mean:np.float64(0.0) - val-aux/aime2025/score/worst@16/std:np.float64(0.0) - val-aux/aime2025/score/maj@16/mean:np.float64(0.12466666666666665) - val-aux/aime2025/score/maj@16/std:np.float64(0.07413997539344604) - val-core/aime2025/acc/mean@16:np.float64(0.075) - val-aux/aime2025/acc/std@16:np.float64(0.11303097869136272) - val-aux/aime2025/acc/best@2/mean:np.float64(0.12026666666666667) - val-aux/aime2025/acc/best@2/std:np.float64(0.12293479659915928) - val-aux/aime2025/acc/worst@2/mean:np.float64(0.0272) - val-aux/aime2025/acc/worst@2/std:np.float64(0.06464278709495463) - val-aux/aime2025/acc/maj@2/mean:np.float64(0.0737) - 
val-aux/aime2025/acc/maj@2/std:np.float64(0.11273107654299482) - val-aux/aime2025/acc/best@4/mean:np.float64(0.17296666666666669) - val-aux/aime2025/acc/best@4/std:np.float64(0.11555919982117263) - val-aux/aime2025/acc/worst@4/mean:np.float64(0.0064) - val-aux/aime2025/acc/worst@4/std:np.float64(0.026056613638061114) - val-aux/aime2025/acc/maj@4/mean:np.float64(0.09086666666666665) - val-aux/aime2025/acc/maj@4/std:np.float64(0.1090674671461239) - val-aux/aime2025/acc/best@8/mean:np.float64(0.21883333333333332) - val-aux/aime2025/acc/best@8/std:np.float64(0.09382863615353057) - val-aux/aime2025/acc/worst@8/mean:np.float64(0.0004) - val-aux/aime2025/acc/worst@8/std:np.float64(0.004530313536682326) - val-aux/aime2025/acc/maj@8/mean:np.float64(0.108) - val-aux/aime2025/acc/maj@8/std:np.float64(0.09508918365000583) - val-core/aime2025/acc/best@16/mean:np.float64(0.25903333333333334) - val-core/aime2025/acc/best@16/std:np.float64(0.06690465714500902) - val-aux/aime2025/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2025/acc/worst@16/std:np.float64(0.0) - val-core/aime2025/acc/maj@16/mean:np.float64(0.12466666666666665) - val-core/aime2025/acc/maj@16/std:np.float64(0.07413997539344604) - val-aux/math500/reward/mean@4:np.float64(0.6975) - val-aux/math500/reward/std@4:np.float64(0.12454036255195278) - val-aux/math500/reward/best@2/mean:np.float64(0.7539020000000001) - val-aux/math500/reward/best@2/std:np.float64(0.10047042585637339) - val-aux/math500/reward/worst@2/mean:np.float64(0.641168) - val-aux/math500/reward/worst@2/std:np.float64(0.11336999997260014) - val-aux/math500/reward/maj@2/mean:np.float64(0.6975399999999999) - val-aux/math500/reward/maj@2/std:np.float64(0.1244366889489385) - val-aux/math500/reward/best@4/mean:np.float64(0.793644) - val-aux/math500/reward/best@4/std:np.float64(0.06039753315287231) - val-aux/math500/reward/worst@4/mean:np.float64(0.5922360000000001) - val-aux/math500/reward/worst@4/std:np.float64(0.08083984472973549) - 
val-aux/math500/reward/maj@4/mean:np.float64(0.71196) - val-aux/math500/reward/maj@4/std:np.float64(0.11344612846032773) - val-aux/math500/score/mean@4:np.float64(0.6975) - val-aux/math500/score/std@4:np.float64(0.12454036255195278) - val-aux/math500/score/best@2/mean:np.float64(0.7539020000000001) - val-aux/math500/score/best@2/std:np.float64(0.10047042585637339) - val-aux/math500/score/worst@2/mean:np.float64(0.641168) - val-aux/math500/score/worst@2/std:np.float64(0.11336999997260014) - val-aux/math500/score/maj@2/mean:np.float64(0.6975399999999999) - val-aux/math500/score/maj@2/std:np.float64(0.1244366889489385) - val-aux/math500/score/best@4/mean:np.float64(0.793644) - val-aux/math500/score/best@4/std:np.float64(0.06039753315287231) - val-aux/math500/score/worst@4/mean:np.float64(0.5922360000000001) - val-aux/math500/score/worst@4/std:np.float64(0.08083984472973549) - val-aux/math500/score/maj@4/mean:np.float64(0.71196) - val-aux/math500/score/maj@4/std:np.float64(0.11344612846032773) - val-core/math500/acc/mean@4:np.float64(0.6975) - val-aux/math500/acc/std@4:np.float64(0.12454036255195278) - val-aux/math500/acc/best@2/mean:np.float64(0.7539020000000001) - val-aux/math500/acc/best@2/std:np.float64(0.10047042585637339) - val-aux/math500/acc/worst@2/mean:np.float64(0.641168) - val-aux/math500/acc/worst@2/std:np.float64(0.11336999997260014) - val-aux/math500/acc/maj@2/mean:np.float64(0.6975399999999999) - val-aux/math500/acc/maj@2/std:np.float64(0.1244366889489385) - val-core/math500/acc/best@4/mean:np.float64(0.793644) - val-core/math500/acc/best@4/std:np.float64(0.06039753315287231) - val-aux/math500/acc/worst@4/mean:np.float64(0.5922360000000001) - val-aux/math500/acc/worst@4/std:np.float64(0.08083984472973549) - val-core/math500/acc/maj@4/mean:np.float64(0.71196) - val-core/math500/acc/maj@4/std:np.float64(0.11344612846032773) - val-aux/num_turns/min:np.int32(2) - val-aux/num_turns/max:np.int32(2) - val-aux/num_turns/mean:np.float64(2.0) - 
val-aux/response_length/clip_ratio:0.05304054054054054 - val-aux/aime2024/response_length/clip_ratio:0.11875 - val-aux/aime2025/response_length/clip_ratio:0.10833333333333334 - val-aux/math500/response_length/clip_ratio:0.024 - val-best/metric:0.25903333333333334 - val-best/step:100.0 - training/global_step:100 - training/epoch:0 - critic/score/mean:0.5651041865348816 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5565673112869263 - critic/rewards/max:1.0010982751846313 - critic/rewards/min:-0.055582717061042786 - critic/advantages/mean:-0.07977275550365448 - critic/advantages/max:2.4748313426971436 - critic/advantages/min:-2.4748499393463135 - critic/returns/mean:-0.07977275550365448 - critic/returns/max:2.4748313426971436 - critic/returns/min:-2.4748499393463135 - response_length/mean:1085.0963134765625 - response_length/max:8192.0 - response_length/min:221.0 - response_length/clip_ratio:0.0026041667442768812 - response_length_non_aborted/mean:1085.0963134765625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:221.0 - response_length_non_aborted/clip_ratio:0.0026041667442768812 - response/aborted_ratio:0.0 - prompt_length/mean:224.15625 - prompt_length/max:355.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.835364133119583e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(2.161417412571609) - timing_s/agent_loop/generate_sequences/max:np.float64(28.12153502367437) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.002880585272578) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - 
timing_s/agent_loop/slowest/generate_sequences:np.float64(28.12153502367437) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:185 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.4396541537717 - timing_s/reward:0.00013942550867795944 - timing_s/old_log_prob:9.638793433085084 - timing_s/ref:19.899007105268538 - timing_s/adv:0.08620079047977924 - timing_s/update_actor:21.359934855252504 - timing_s/save_checkpoint:77.19161368906498 - timing_s/update_weights:25.01614058855921 - timing_s/step:183.03809204418212 - timing_s/testing:126.92606109660119 - timing_s/stop_profile:0.00010773725807666779 - timing_per_token_ms/adv:8.572876788381098e-05 - timing_per_token_ms/update_actor:0.02124297105661478 - timing_per_token_ms/gen:0.03532670888214576 - timing_per_token_ms/ref:0.019790043127806835 - perf/total_num_tokens:1460545 - perf/time_per_step:183.03809204418212 - perf/throughput:1994.8648170560182 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:236.0 - frontier/mean_score:2.1629281249999996 - frontier/mean_frontier_pct:0.08973155497654567 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.11 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - 
frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.3 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:3.3709999999999996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.51 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:3.53 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.339899999999999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.3 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:2.7598999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.51 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.3 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:1.49 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.51 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:2.7598999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.7 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - 
frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.09 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.3 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.51 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:1.7 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.3 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:1.7 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.7598999999999996 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.237 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:48.0 - frontier/cluster_40/score:3.8596999999999997 - frontier/cluster_40/cluster_size:102.0 - 
frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:1.7 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.3 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.51 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:1.2401 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:1.7 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.3629999999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:1.91 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - 
frontier/cluster_59/frontier:32.0 - frontier/cluster_59/score:2.3629999999999995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.6569999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:100.0 - cluster/prob_snapshot/cluster_0:0.022466650388579142 - cluster/prob_snapshot/cluster_1:0.014448006680758291 - cluster/prob_snapshot/cluster_2:0.010763764977164927 - cluster/prob_snapshot/cluster_3:0.016615207682872033 - cluster/prob_snapshot/cluster_4:0.015098166981392413 - cluster/prob_snapshot/cluster_5:0.014448006680758291 - cluster/prob_snapshot/cluster_6:0.016615207682872033 - cluster/prob_snapshot/cluster_7:0.024352115260418097 - cluster/prob_snapshot/cluster_8:0.018132248384351655 - cluster/prob_snapshot/cluster_9:0.010763764977164927 - cluster/prob_snapshot/cluster_10:0.025500731791538382 - cluster/prob_snapshot/cluster_11:0.012280805678644547 - cluster/prob_snapshot/cluster_12:0.016903445416153157 - cluster/prob_snapshot/cluster_13:0.016615207682872033 - cluster/prob_snapshot/cluster_14:0.0199375268191124 - cluster/prob_snapshot/cluster_15:0.018132248384351655 - cluster/prob_snapshot/cluster_16:0.016615207682872033 - cluster/prob_snapshot/cluster_17:0.010763764977164927 - cluster/prob_snapshot/cluster_18:0.018132248384351655 - cluster/prob_snapshot/cluster_19:0.0199375268191124 - cluster/prob_snapshot/cluster_20:0.014448006680758291 - cluster/prob_snapshot/cluster_21:0.012280805678644547 - cluster/prob_snapshot/cluster_22:0.012280805678644547 - cluster/prob_snapshot/cluster_23:0.015098166981392413 - cluster/prob_snapshot/cluster_24:0.016615207682872033 - 
cluster/prob_snapshot/cluster_25:0.010763764977164927 - cluster/prob_snapshot/cluster_26:0.014448006680758291 - cluster/prob_snapshot/cluster_27:0.018132248384351655 - cluster/prob_snapshot/cluster_28:0.014448006680758291 - cluster/prob_snapshot/cluster_29:0.014448006680758291 - cluster/prob_snapshot/cluster_30:0.012280805678644547 - cluster/prob_snapshot/cluster_31:0.016615207682872033 - cluster/prob_snapshot/cluster_32:0.012280805678644547 - cluster/prob_snapshot/cluster_33:0.012280805678644547 - cluster/prob_snapshot/cluster_34:0.016615207682872033 - cluster/prob_snapshot/cluster_35:0.0199375268191124 - cluster/prob_snapshot/cluster_36:0.018132248384351655 - cluster/prob_snapshot/cluster_37:0.014448006680758291 - cluster/prob_snapshot/cluster_38:0.014448006680758291 - cluster/prob_snapshot/cluster_39:0.01616009547242815 - cluster/prob_snapshot/cluster_40:0.02788248569286139 - cluster/prob_snapshot/cluster_41:0.016615207682872033 - cluster/prob_snapshot/cluster_42:0.012280805678644547 - cluster/prob_snapshot/cluster_43:0.014448006680758291 - cluster/prob_snapshot/cluster_44:0.016615207682872033 - cluster/prob_snapshot/cluster_45:0.012280805678644547 - cluster/prob_snapshot/cluster_46:0.018132248384351655 - cluster/prob_snapshot/cluster_47:0.014448006680758291 - cluster/prob_snapshot/cluster_48:0.014448006680758291 - cluster/prob_snapshot/cluster_49:0.019194176875387388 - cluster/prob_snapshot/cluster_50:0.008958486542404179 - cluster/prob_snapshot/cluster_51:0.013797846380124167 - cluster/prob_snapshot/cluster_52:0.014448006680758291 - cluster/prob_snapshot/cluster_53:0.012280805678644547 - cluster/prob_snapshot/cluster_54:0.012280805678644547 - cluster/prob_snapshot/cluster_55:0.012280805678644547 - cluster/prob_snapshot/cluster_56:0.017070319893315918 - cluster/prob_snapshot/cluster_57:0.013797846380124167 - cluster/prob_snapshot/cluster_58:0.016615207682872033 - cluster/prob_snapshot/cluster_59:0.017070319893315918 - 
cluster/prob_snapshot/cluster_60:0.012280805678644547 - cluster/prob_snapshot/cluster_61:0.014448006680758291 - cluster/prob_snapshot/cluster_62:0.019194176875387388 - cluster/prob_snapshot/cluster_63:0.014448006680758291
[36m(TaskRunner pid=2823680)[0m Training Progress:  13%|█▎        | 101/800 [3:06:53<28:42:13, 147.83s/it]
[36m(TaskRunner pid=2823680)[0m step:101 - global_seqlen/min:334786 - global_seqlen/max:445246 - global_seqlen/minmax_diff:110460 - global_seqlen/balanced_min:385045 - global_seqlen/balanced_max:385131 - global_seqlen/mean:385090.25 - frontier/skipped_zero_acc_count:40.0 - actor/entropy:np.float64(0.23152618821371684) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010048165917396545 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.016350382065866143) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004818078030604573) - actor/ppo_kl:np.float64(4.3216724063458145e-05) - actor/pg_clipfrac_lower:np.float64(1.5978117020991207e-06) - actor/grad_norm:np.float64(0.21218357980251312) - perf/mfu/actor:np.float64(0.2466554501688205) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.96241760253906) - actor/lr:np.float64(1e-06) - training/global_step:101 - training/epoch:0 - critic/score/mean:0.5795454382896423 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5706354379653931 - critic/rewards/max:1.0086829662322998 - critic/rewards/min:-0.04253676161170006 - critic/advantages/mean:-0.10780280083417892 - critic/advantages/max:2.474862575531006 - critic/advantages/min:-2.4748520851135254 - critic/returns/mean:-0.10780280083417892 - critic/returns/max:2.474862575531006 - critic/returns/min:-2.4748520851135254 - response_length/mean:1120.1746826171875 - response_length/max:8192.0 - response_length/min:261.0 - response_length/clip_ratio:0.004261363763362169 - response_length_non_aborted/mean:1120.1746826171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:261.0 - response_length_non_aborted/clip_ratio:0.004261363763362169 - response/aborted_ratio:0.0 - prompt_length/mean:231.01136779785156 - prompt_length/max:373.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.055327296257019e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.976774706505239) - timing_s/agent_loop/generate_sequences/max:np.float64(26.81892133038491) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.0862337520329675) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(26.81892133038491) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:332 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.32391487620771 - timing_s/reward:0.0002033337950706482 - timing_s/old_log_prob:8.916251267306507 - timing_s/ref:19.88545506633818 - timing_s/adv:0.07427819538861513 - timing_s/update_actor:18.173464852385223 - timing_s/update_weights:26.20551747828722 - timing_s/step:102.9641030896455 - timing_s/stop_profile:5.0825998187065125e-05 - timing_per_token_ms/adv:7.808606221240296e-05 - timing_per_token_ms/update_actor:0.01910512633827101 - timing_per_token_ms/gen:0.03718463520454235 - timing_per_token_ms/ref:0.020904881618462507 - perf/total_num_tokens:1540361 - perf/time_per_step:102.9641030896455 - perf/throughput:3740.0437477197456 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:276.0 - frontier/mean_score:2.15828328125 - frontier/mean_frontier_pct:0.11057403969898677 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:1.7 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.763 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.3 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.6596999999999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:3.3709999999999996 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.339899999999999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.3 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:2.7598999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.2569999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - 
frontier/cluster_17/score:1.49 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.6569999999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.4319299999999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.7 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.09 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.51 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.51 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:1.7 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.3 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:1.91 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - 
frontier/cluster_35/score:2.7598999999999996 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.237 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:48.0 - frontier/cluster_40/score:3.8596999999999997 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:2.09 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.91 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.51 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:1.2401 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:1.7 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.3629999999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:1.91 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:32.0 - frontier/cluster_59/score:2.3629999999999995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.6569999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:101.0 - cluster/prob_snapshot/cluster_0:0.022276095736679602 - cluster/prob_snapshot/cluster_1:0.01230723521363514 - cluster/prob_snapshot/cluster_2:0.010786929687244918 - cluster/prob_snapshot/cluster_3:0.016650965289035778 - cluster/prob_snapshot/cluster_4:0.012763326871552207 - cluster/prob_snapshot/cluster_5:0.014479100251335461 - cluster/prob_snapshot/cluster_6:0.016650965289035778 - cluster/prob_snapshot/cluster_7:0.01925503146923846 - cluster/prob_snapshot/cluster_8:0.019235484683899157 - cluster/prob_snapshot/cluster_9:0.010786929687244918 - cluster/prob_snapshot/cluster_10:0.024404523473625917 - cluster/prob_snapshot/cluster_11:0.01230723521363514 - cluster/prob_snapshot/cluster_12:0.016939823339049916 - 
cluster/prob_snapshot/cluster_13:0.016650965289035778 - cluster/prob_snapshot/cluster_14:0.019980434391830365 - cluster/prob_snapshot/cluster_15:0.023579214759299794 - cluster/prob_snapshot/cluster_16:0.018171270815426 - cluster/prob_snapshot/cluster_17:0.010786929687244918 - cluster/prob_snapshot/cluster_18:0.019235484683899157 - cluster/prob_snapshot/cluster_19:0.02484562926278285 - cluster/prob_snapshot/cluster_20:0.014479100251335461 - cluster/prob_snapshot/cluster_21:0.01230723521363514 - cluster/prob_snapshot/cluster_22:0.01230723521363514 - cluster/prob_snapshot/cluster_23:0.015130659762645556 - cluster/prob_snapshot/cluster_24:0.018171270815426 - cluster/prob_snapshot/cluster_25:0.010786929687244918 - cluster/prob_snapshot/cluster_26:0.01230723521363514 - cluster/prob_snapshot/cluster_27:0.018171270815426 - cluster/prob_snapshot/cluster_28:0.014479100251335461 - cluster/prob_snapshot/cluster_29:0.014479100251335461 - cluster/prob_snapshot/cluster_30:0.01230723521363514 - cluster/prob_snapshot/cluster_31:0.016650965289035778 - cluster/prob_snapshot/cluster_32:0.010786929687244918 - cluster/prob_snapshot/cluster_33:0.01230723521363514 - cluster/prob_snapshot/cluster_34:0.013827540740025365 - cluster/prob_snapshot/cluster_35:0.019980434391830365 - cluster/prob_snapshot/cluster_36:0.018171270815426 - cluster/prob_snapshot/cluster_37:0.014479100251335461 - cluster/prob_snapshot/cluster_38:0.014479100251335461 - cluster/prob_snapshot/cluster_39:0.016194873631118715 - cluster/prob_snapshot/cluster_40:0.027942491620039738 - cluster/prob_snapshot/cluster_41:0.016650965289035778 - cluster/prob_snapshot/cluster_42:0.015130659762645556 - cluster/prob_snapshot/cluster_43:0.014479100251335461 - cluster/prob_snapshot/cluster_44:0.013827540740025365 - cluster/prob_snapshot/cluster_45:0.01230723521363514 - cluster/prob_snapshot/cluster_46:0.018171270815426 - cluster/prob_snapshot/cluster_47:0.014479100251335461 - cluster/prob_snapshot/cluster_48:0.014479100251335461 - 
cluster/prob_snapshot/cluster_49:0.019235484683899157 - cluster/prob_snapshot/cluster_50:0.008977766110840552 - cluster/prob_snapshot/cluster_51:0.013827540740025365 - cluster/prob_snapshot/cluster_52:0.014479100251335461 - cluster/prob_snapshot/cluster_53:0.01230723521363514 - cluster/prob_snapshot/cluster_54:0.01230723521363514 - cluster/prob_snapshot/cluster_55:0.01230723521363514 - cluster/prob_snapshot/cluster_56:0.017107056946952845 - cluster/prob_snapshot/cluster_57:0.013827540740025365 - cluster/prob_snapshot/cluster_58:0.016650965289035778 - cluster/prob_snapshot/cluster_59:0.017107056946952845 - cluster/prob_snapshot/cluster_60:0.01230723521363514 - cluster/prob_snapshot/cluster_61:0.014479100251335461 - cluster/prob_snapshot/cluster_62:0.019235484683899157 - cluster/prob_snapshot/cluster_63:0.014479100251335461
[36m(TaskRunner pid=2823680)[0m Training Progress:  13%|█▎        | 102/800 [3:08:35<25:59:55, 134.09s/it]
[36m(TaskRunner pid=2823680)[0m step:102 - global_seqlen/min:294904 - global_seqlen/max:399069 - global_seqlen/minmax_diff:104165 - global_seqlen/balanced_min:346924 - global_seqlen/balanced_max:346980 - global_seqlen/mean:346957.75 - frontier/skipped_zero_acc_count:28.0 - actor/entropy:np.float64(0.2535443100333214) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010321086272597313 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.06932312992285006) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00032730419392464684) - actor/ppo_kl:np.float64(-1.111952497240054e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2200810553935858) - perf/mfu/actor:np.float64(0.1977571462535641) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.62070846557617) - actor/lr:np.float64(1e-06) - training/global_step:102 - training/epoch:0 - critic/score/mean:0.5924999713897705 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5842151045799255 - critic/rewards/max:1.006773829460144 - critic/rewards/min:-0.04904201254248619 - critic/advantages/mean:-0.13304248452186584 - critic/advantages/max:2.474848985671997 - critic/advantages/min:-2.4748377799987793 - critic/returns/mean:-0.13304248452186584 - critic/returns/max:2.474848985671997 - critic/returns/min:-2.4748377799987793 - response_length/mean:1057.5400390625 - response_length/max:8192.0 - response_length/min:158.0 - response_length/clip_ratio:0.004999999888241291 - response_length_non_aborted/mean:1057.5400390625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:158.0 - response_length_non_aborted/clip_ratio:0.004999999888241291 - response/aborted_ratio:0.0 - prompt_length/mean:228.35000610351562 - prompt_length/max:346.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.842760533094406e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.276102832518518) - timing_s/agent_loop/generate_sequences/max:np.float64(28.027788958512247) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.049539828449269) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.027788958512247) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:213 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.221405997872353 - timing_s/reward:0.00021032709628343582 - timing_s/old_log_prob:9.315357400104403 - timing_s/ref:16.447378562763333 - timing_s/adv:0.08735383488237858 - timing_s/update_actor:20.54250395204872 - timing_s/update_weights:24.563484290614724 - timing_s/step:101.59299855399877 - timing_s/stop_profile:6.759818643331528e-05 - timing_per_token_ms/adv:8.491573431862229e-05 - timing_per_token_ms/update_actor:0.019969149725140484 - timing_per_token_ms/gen:0.03572135096293326 - timing_per_token_ms/ref:0.015988321865364974 - perf/total_num_tokens:1387831 - perf/time_per_step:101.59299855399877 - perf/throughput:3415.1738302672975 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:304.0 - frontier/mean_score:2.1942357142857145 - frontier/mean_frontier_pct:0.12244187571058691 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.0538999999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:1.7 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.763 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.91 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.6596999999999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:3.3709999999999996 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.339899999999999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.3 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:2.8319299999999994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.1798999999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:1.49 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.7598999999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.4319299999999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.3 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.7 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.09 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.6569999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.51 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:1.7 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.51 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:1.91 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - 
frontier/cluster_35/score:2.7598999999999996 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.237 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:4.201789999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.51 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:2.09 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.91 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.51 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:32.0 - frontier/cluster_53/score:1.49 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:1.7 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.3629999999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:1.91 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.51 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:32.0 - frontier/cluster_59/score:2.3629999999999995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.6569999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.3 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:102.0 - cluster/prob_snapshot/cluster_0:0.022091793903000534 - cluster/prob_snapshot/cluster_1:0.012297733925505391 - cluster/prob_snapshot/cluster_2:0.009715209801149259 - cluster/prob_snapshot/cluster_3:0.016638110605095527 - cluster/prob_snapshot/cluster_4:0.012753473476862355 - cluster/prob_snapshot/cluster_5:0.01446792226530046 - cluster/prob_snapshot/cluster_6:0.013816865763361939 - cluster/prob_snapshot/cluster_7:0.019240166424509812 - cluster/prob_snapshot/cluster_8:0.01922063472945166 - cluster/prob_snapshot/cluster_9:0.010778602087648843 - cluster/prob_snapshot/cluster_10:0.024385682978163924 - cluster/prob_snapshot/cluster_11:0.012297733925505391 - cluster/prob_snapshot/cluster_12:0.01692674565428827 - 
cluster/prob_snapshot/cluster_13:0.016638110605095527 - cluster/prob_snapshot/cluster_14:0.020486071550386162 - cluster/prob_snapshot/cluster_15:0.023003273005714465 - cluster/prob_snapshot/cluster_16:0.018157242442952075 - cluster/prob_snapshot/cluster_17:0.010778602087648843 - cluster/prob_snapshot/cluster_18:0.01996500933000137 - cluster/prob_snapshot/cluster_19:0.0248264482299763 - cluster/prob_snapshot/cluster_20:0.016638110605095527 - cluster/prob_snapshot/cluster_21:0.012297733925505391 - cluster/prob_snapshot/cluster_22:0.012297733925505391 - cluster/prob_snapshot/cluster_23:0.01511897876723898 - cluster/prob_snapshot/cluster_24:0.01922063472945166 - cluster/prob_snapshot/cluster_25:0.010778602087648843 - cluster/prob_snapshot/cluster_26:0.012297733925505391 - cluster/prob_snapshot/cluster_27:0.018157242442952075 - cluster/prob_snapshot/cluster_28:0.016638110605095527 - cluster/prob_snapshot/cluster_29:0.01446792226530046 - cluster/prob_snapshot/cluster_30:0.012297733925505391 - cluster/prob_snapshot/cluster_31:0.018157242442952075 - cluster/prob_snapshot/cluster_32:0.010778602087648843 - cluster/prob_snapshot/cluster_33:0.012297733925505391 - cluster/prob_snapshot/cluster_34:0.013816865763361939 - cluster/prob_snapshot/cluster_35:0.01996500933000137 - cluster/prob_snapshot/cluster_36:0.018157242442952075 - cluster/prob_snapshot/cluster_37:0.01446792226530046 - cluster/prob_snapshot/cluster_38:0.01446792226530046 - cluster/prob_snapshot/cluster_39:0.016182371053738567 - cluster/prob_snapshot/cluster_40:0.030395585547558404 - cluster/prob_snapshot/cluster_41:0.018157242442952075 - cluster/prob_snapshot/cluster_42:0.01511897876723898 - cluster/prob_snapshot/cluster_43:0.01446792226530046 - cluster/prob_snapshot/cluster_44:0.013816865763361939 - cluster/prob_snapshot/cluster_45:0.012297733925505391 - cluster/prob_snapshot/cluster_46:0.018157242442952075 - cluster/prob_snapshot/cluster_47:0.01446792226530046 - 
cluster/prob_snapshot/cluster_48:0.01446792226530046 - cluster/prob_snapshot/cluster_49:0.01922063472945166 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.013816865763361939 - cluster/prob_snapshot/cluster_52:0.01446792226530046 - cluster/prob_snapshot/cluster_53:0.010778602087648843 - cluster/prob_snapshot/cluster_54:0.012297733925505391 - cluster/prob_snapshot/cluster_55:0.012297733925505391 - cluster/prob_snapshot/cluster_56:0.01709385015645249 - cluster/prob_snapshot/cluster_57:0.013816865763361939 - cluster/prob_snapshot/cluster_58:0.018157242442952075 - cluster/prob_snapshot/cluster_59:0.01709385015645249 - cluster/prob_snapshot/cluster_60:0.012297733925505391 - cluster/prob_snapshot/cluster_61:0.01446792226530046 - cluster/prob_snapshot/cluster_62:0.01922063472945166 - cluster/prob_snapshot/cluster_63:0.016638110605095527
[36m(TaskRunner pid=2823680)[0m Training Progress:  13%|█▎        | 103/800 [3:10:26<24:34:45, 126.95s/it]
[36m(TaskRunner pid=2823680)[0m step:103 - global_seqlen/min:336040 - global_seqlen/max:439590 - global_seqlen/minmax_diff:103550 - global_seqlen/balanced_min:399809 - global_seqlen/balanced_max:399868 - global_seqlen/mean:399845.5 - frontier/skipped_zero_acc_count:36.0 - actor/entropy:np.float64(0.259440291672945) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009148952551186085 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.040075776596495416) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00042633043579806565) - actor/ppo_kl:np.float64(5.365958707384182e-05) - actor/pg_clipfrac_lower:np.float64(9.712154243733613e-07) - actor/grad_norm:np.float64(0.20892967656254768) - perf/mfu/actor:np.float64(0.2400655835388696) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.448974609375) - actor/lr:np.float64(1e-06) - training/global_step:103 - training/epoch:0 - critic/score/mean:0.532608687877655 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5240276455879211 - critic/rewards/max:1.0070841312408447 - critic/rewards/min:-0.042673468589782715 - critic/advantages/mean:-0.11866806447505951 - critic/advantages/max:2.4748432636260986 - critic/advantages/min:-2.4748222827911377 - critic/returns/mean:-0.11866806447505951 - critic/returns/max:2.4748432636260986 - critic/returns/min:-2.4748222827911377 - response_length/mean:1244.0068359375 - response_length/max:8192.0 - response_length/min:190.0 - response_length/clip_ratio:0.009510869160294533 - response_length_non_aborted/mean:1244.0068359375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:190.0 - response_length_non_aborted/clip_ratio:0.009510869160294533 - response/aborted_ratio:0.0 - prompt_length/mean:244.77174377441406 - prompt_length/max:417.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.712895214557648e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5272900816053152) - timing_s/agent_loop/generate_sequences/max:np.float64(28.901518645696342) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.344626718415384) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.901518645696342) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:208 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.69952298887074 - timing_s/reward:0.00020808540284633636 - timing_s/old_log_prob:9.676069082692266 - timing_s/ref:21.743080280721188 - timing_s/adv:0.0981528740376234 - timing_s/update_actor:19.482627340592444 - timing_s/update_weights:27.834199514240026 - timing_s/step:109.99026811122894 - timing_s/stop_profile:7.431674748659134e-05 - timing_per_token_ms/adv:8.957671022406153e-05 - timing_per_token_ms/update_actor:0.017780321572883048 - timing_per_token_ms/gen:0.033529807576184005 - timing_per_token_ms/ref:0.019843266137455098 - perf/total_num_tokens:1599382 - perf/time_per_step:109.99026811122894 - perf/throughput:3635.28071043205 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:340.0 - frontier/mean_score:2.2061741428571424 - frontier/mean_frontier_pct:0.14882542653330322 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.0538999999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:1.7 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.763 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.91 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.6596999999999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:48.0 - frontier/cluster_10/score:3.8596999999999997 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.339899999999999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.51 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:2.8823509999999994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.1798999999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - 
frontier/cluster_17/score:1.49 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.7598999999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.4319299999999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.3 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.7 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.3629999999999995 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.6569999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.51 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:1.7 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.51 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:1.91 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.7598999999999996 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.4659 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:4.201789999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.51 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.763 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.91 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.6569999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:1.7 - 
frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:1.7 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.5540999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:1.91 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.51 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:32.0 - frontier/cluster_59/score:2.5540999999999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:1.7 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.7598999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.3 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:103.0 - cluster/prob_snapshot/cluster_0:0.021972246992173216 - cluster/prob_snapshot/cluster_1:0.012231186314776014 - cluster/prob_snapshot/cluster_2:0.009662637188673051 - cluster/prob_snapshot/cluster_3:0.01654807560234402 - cluster/prob_snapshot/cluster_4:0.012684459689970656 - cluster/prob_snapshot/cluster_5:0.014389630958560018 - cluster/prob_snapshot/cluster_6:0.013742097565424816 - cluster/prob_snapshot/cluster_7:0.019136050730241036 - cluster/prob_snapshot/cluster_8:0.019116624728446982 - cluster/prob_snapshot/cluster_9:0.010720275064127214 - cluster/prob_snapshot/cluster_10:0.02776982930537705 - cluster/prob_snapshot/cluster_11:0.012231186314776014 - 
cluster/prob_snapshot/cluster_12:0.016835148739967287 - cluster/prob_snapshot/cluster_13:0.01805898685299282 - cluster/prob_snapshot/cluster_14:0.020737983591518207 - cluster/prob_snapshot/cluster_15:0.022878793742562496 - cluster/prob_snapshot/cluster_16:0.01805898685299282 - cluster/prob_snapshot/cluster_17:0.010720275064127214 - cluster/prob_snapshot/cluster_18:0.019856971241264895 - cluster/prob_snapshot/cluster_19:0.024692103087805437 - cluster/prob_snapshot/cluster_20:0.01654807560234402 - cluster/prob_snapshot/cluster_21:0.012231186314776014 - cluster/prob_snapshot/cluster_22:0.012231186314776014 - cluster/prob_snapshot/cluster_23:0.017001348977538657 - cluster/prob_snapshot/cluster_24:0.019116624728446982 - cluster/prob_snapshot/cluster_25:0.009662637188673051 - cluster/prob_snapshot/cluster_26:0.012231186314776014 - cluster/prob_snapshot/cluster_27:0.01805898685299282 - cluster/prob_snapshot/cluster_28:0.01654807560234402 - cluster/prob_snapshot/cluster_29:0.012231186314776014 - cluster/prob_snapshot/cluster_30:0.012231186314776014 - cluster/prob_snapshot/cluster_31:0.01805898685299282 - cluster/prob_snapshot/cluster_32:0.010720275064127214 - cluster/prob_snapshot/cluster_33:0.012231186314776014 - cluster/prob_snapshot/cluster_34:0.013742097565424816 - cluster/prob_snapshot/cluster_35:0.019856971241264895 - cluster/prob_snapshot/cluster_36:0.01805898685299282 - cluster/prob_snapshot/cluster_37:0.014389630958560018 - cluster/prob_snapshot/cluster_38:0.014389630958560018 - cluster/prob_snapshot/cluster_39:0.017741695490356573 - cluster/prob_snapshot/cluster_40:0.03023110373268394 - cluster/prob_snapshot/cluster_41:0.01805898685299282 - cluster/prob_snapshot/cluster_42:0.012684459689970656 - cluster/prob_snapshot/cluster_43:0.014389630958560018 - cluster/prob_snapshot/cluster_44:0.013742097565424816 - cluster/prob_snapshot/cluster_45:0.015037164351695217 - cluster/prob_snapshot/cluster_46:0.019116624728446982 - 
cluster/prob_snapshot/cluster_47:0.014389630958560018 - cluster/prob_snapshot/cluster_48:0.014389630958560018 - cluster/prob_snapshot/cluster_49:0.019116624728446982 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.013742097565424816 - cluster/prob_snapshot/cluster_52:0.012231186314776014 - cluster/prob_snapshot/cluster_53:0.009662637188673051 - cluster/prob_snapshot/cluster_54:0.012231186314776014 - cluster/prob_snapshot/cluster_55:0.012231186314776014 - cluster/prob_snapshot/cluster_56:0.01837627821562907 - cluster/prob_snapshot/cluster_57:0.013742097565424816 - cluster/prob_snapshot/cluster_58:0.01805898685299282 - cluster/prob_snapshot/cluster_59:0.01837627821562907 - cluster/prob_snapshot/cluster_60:0.012231186314776014 - cluster/prob_snapshot/cluster_61:0.012231186314776014 - cluster/prob_snapshot/cluster_62:0.019856971241264895 - cluster/prob_snapshot/cluster_63:0.01654807560234402
[36m(RewardLoopWorker pid=2826754)[0m WARNING:2026-04-12 14:42:51,793:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  13%|█▎        | 104/800 [3:12:10<23:14:03, 120.18s/it]
[36m(TaskRunner pid=2823680)[0m step:104 - global_seqlen/min:317545 - global_seqlen/max:426883 - global_seqlen/minmax_diff:109338 - global_seqlen/balanced_min:377314 - global_seqlen/balanced_max:377409 - global_seqlen/mean:377365.0 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.2402552512028943) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009615967981517315 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.013805594258883502) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00035037307834104666) - actor/ppo_kl:np.float64(2.4544801221445002e-05) - actor/pg_clipfrac_lower:np.float64(1.1738191697867992e-06) - actor/grad_norm:np.float64(0.21700256690382957) - perf/mfu/actor:np.float64(0.23501308500366516) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.33610534667969) - actor/lr:np.float64(1e-06) - training/global_step:104 - training/epoch:0 - critic/score/mean:0.5618131756782532 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5534648895263672 - critic/rewards/max:1.0091115236282349 - critic/rewards/min:-0.04611749202013016 - critic/advantages/mean:-0.15637199580669403 - critic/advantages/max:2.474862813949585 - critic/advantages/min:-2.4748611450195312 - critic/returns/mean:-0.15637199580669403 - critic/returns/max:2.474862813949585 - critic/returns/min:-2.4748611450195312 - response_length/mean:1122.2197265625 - response_length/max:8192.0 - response_length/min:188.0 - response_length/clip_ratio:0.0013736264081671834 - response_length_non_aborted/mean:1122.2197265625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:188.0 - response_length_non_aborted/clip_ratio:0.0013736264081671834 - response/aborted_ratio:0.0 - prompt_length/mean:243.64834594726562 - prompt_length/max:382.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.215010166168213e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5201602820307016) - timing_s/agent_loop/generate_sequences/max:np.float64(27.783207828179002) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.788929186856876) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.783207828179002) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:243 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.61092834174633 - timing_s/reward:0.00017505325376987457 - timing_s/old_log_prob:9.143909032456577 - timing_s/ref:19.557482303120196 - timing_s/adv:0.088412219658494 - timing_s/update_actor:18.759416131302714 - timing_s/update_weights:26.544701624661684 - timing_s/step:104.16448329109699 - timing_s/stop_profile:6.055459380149841e-05 - timing_per_token_ms/adv:8.891440823621212e-05 - timing_per_token_ms/update_actor:0.018865971136280427 - timing_per_token_ms/gen:0.036244551053820837 - timing_per_token_ms/ref:0.019668570388675435 - perf/total_num_tokens:1509460 - perf/time_per_step:104.16448329109699 - perf/throughput:3622.779934936361 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:377.0 - frontier/mean_score:2.2083457095238086 - frontier/mean_frontier_pct:0.15721099556371235 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.0538999999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.09 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:1.763 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.637 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.6596999999999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:48.0 - frontier/cluster_10/score:3.8596999999999997 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.339899999999999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.51 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:2.9176456999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:2.5259299999999993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 
- frontier/cluster_17/score:1.49 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.7598999999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.3023509999999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.3 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.7 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.3629999999999995 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.6569999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.51 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:2.09 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.51 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:2.09 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:1.91 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.8319299999999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.4659 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.841252999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.51 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.763 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.91 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.7598999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:1.7 - 
frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:2.09 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.5540999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:1.91 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.6569999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:2.6878699999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:1.7 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.7598999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.91 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:104.0 - cluster/prob_snapshot/cluster_0:0.02195064068345344 - cluster/prob_snapshot/cluster_1:0.015022377624813416 - cluster/prob_snapshot/cluster_2:0.00965313547852843 - cluster/prob_snapshot/cluster_3:0.0165318031277851 - cluster/prob_snapshot/cluster_4:0.012671986484471796 - cluster/prob_snapshot/cluster_5:0.014375480980682696 - cluster/prob_snapshot/cluster_6:0.011766331182688787 - cluster/prob_snapshot/cluster_7:0.01911723338216088 - cluster/prob_snapshot/cluster_8:0.01909782648283696 - cluster/prob_snapshot/cluster_9:0.010709733330608609 - cluster/prob_snapshot/cluster_10:0.027742521970570498 - cluster/prob_snapshot/cluster_11:0.01221915883358029 - 
cluster/prob_snapshot/cluster_12:0.016818593973349713 - cluster/prob_snapshot/cluster_13:0.018041228630756783 - cluster/prob_snapshot/cluster_14:0.020971280134360323 - cluster/prob_snapshot/cluster_15:0.018155729336767915 - cluster/prob_snapshot/cluster_16:0.018041228630756783 - cluster/prob_snapshot/cluster_17:0.010709733330608609 - cluster/prob_snapshot/cluster_18:0.019837444979293083 - cluster/prob_snapshot/cluster_19:0.023736441996019236 - cluster/prob_snapshot/cluster_20:0.0165318031277851 - cluster/prob_snapshot/cluster_21:0.01221915883358029 - cluster/prob_snapshot/cluster_22:0.010709733330608609 - cluster/prob_snapshot/cluster_23:0.016984630778676603 - cluster/prob_snapshot/cluster_24:0.01909782648283696 - cluster/prob_snapshot/cluster_25:0.00965313547852843 - cluster/prob_snapshot/cluster_26:0.01221915883358029 - cluster/prob_snapshot/cluster_27:0.018041228630756783 - cluster/prob_snapshot/cluster_28:0.0165318031277851 - cluster/prob_snapshot/cluster_29:0.01221915883358029 - cluster/prob_snapshot/cluster_30:0.015022377624813416 - cluster/prob_snapshot/cluster_31:0.018041228630756783 - cluster/prob_snapshot/cluster_32:0.010709733330608609 - cluster/prob_snapshot/cluster_33:0.015022377624813416 - cluster/prob_snapshot/cluster_34:0.013728584336551974 - cluster/prob_snapshot/cluster_35:0.02035517792681237 - cluster/prob_snapshot/cluster_36:0.018041228630756783 - cluster/prob_snapshot/cluster_37:0.014375480980682696 - cluster/prob_snapshot/cluster_38:0.014375480980682696 - cluster/prob_snapshot/cluster_39:0.01772424927513273 - cluster/prob_snapshot/cluster_40:0.02760992972174517 - cluster/prob_snapshot/cluster_41:0.018041228630756783 - cluster/prob_snapshot/cluster_42:0.012671986484471796 - cluster/prob_snapshot/cluster_43:0.014375480980682696 - cluster/prob_snapshot/cluster_44:0.013728584336551974 - cluster/prob_snapshot/cluster_45:0.015022377624813416 - cluster/prob_snapshot/cluster_46:0.019837444979293083 - 
cluster/prob_snapshot/cluster_47:0.014375480980682696 - cluster/prob_snapshot/cluster_48:0.014375480980682696 - cluster/prob_snapshot/cluster_49:0.019837444979293083 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.013728584336551974 - cluster/prob_snapshot/cluster_52:0.01221915883358029 - cluster/prob_snapshot/cluster_53:0.00965313547852843 - cluster/prob_snapshot/cluster_54:0.01221915883358029 - cluster/prob_snapshot/cluster_55:0.015022377624813416 - cluster/prob_snapshot/cluster_56:0.018358207986380833 - cluster/prob_snapshot/cluster_57:0.013728584336551974 - cluster/prob_snapshot/cluster_58:0.01909782648283696 - cluster/prob_snapshot/cluster_59:0.019319712031773795 - cluster/prob_snapshot/cluster_60:0.01221915883358029 - cluster/prob_snapshot/cluster_61:0.01221915883358029 - cluster/prob_snapshot/cluster_62:0.019837444979293083 - cluster/prob_snapshot/cluster_63:0.013728584336551974
[36m(TaskRunner pid=2823680)[0m Training Progress:  13%|█▎        | 105/800 [3:13:48<21:54:04, 113.45s/it]
[36m(TaskRunner pid=2823680)[0m step:105 - global_seqlen/min:342512 - global_seqlen/max:425124 - global_seqlen/minmax_diff:82612 - global_seqlen/balanced_min:372559 - global_seqlen/balanced_max:372667 - global_seqlen/mean:372629.25 - frontier/skipped_zero_acc_count:25.0 - actor/entropy:np.float64(0.2730772315094677) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010363882407546043 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.022378465568181127) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003159876162569092) - actor/ppo_kl:np.float64(2.1618030499868614e-05) - actor/pg_clipfrac_lower:np.float64(2.4419404536064786e-07) - actor/grad_norm:np.float64(0.22418465522619394) - perf/mfu/actor:np.float64(0.1840041998054908) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.33821487426758) - actor/lr:np.float64(1e-06) - training/global_step:105 - training/epoch:0 - critic/score/mean:0.541262149810791 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5325576066970825 - critic/rewards/max:1.0077731609344482 - critic/rewards/min:-0.044378671795129776 - critic/advantages/mean:-0.09344906359910965 - critic/advantages/max:2.4748198986053467 - critic/advantages/min:-2.4748592376708984 - critic/returns/mean:-0.09344906359910965 - critic/returns/max:2.4748198986053467 - critic/returns/min:-2.4748592376708984 - response_length/mean:1150.718505859375 - response_length/max:8192.0 - response_length/min:113.0 - response_length/clip_ratio:0.006067960988730192 - response_length_non_aborted/mean:1150.718505859375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:113.0 - response_length_non_aborted/clip_ratio:0.006067960988730192 - response/aborted_ratio:0.0 - prompt_length/mean:241.5631103515625 - prompt_length/max:350.0 - prompt_length/min:185.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.497945964336395e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0112556871026754) - timing_s/agent_loop/generate_sequences/max:np.float64(27.208749034442008) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.558715581291835) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.208749034442008) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:294 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:28.832322941161692 - timing_s/reward:0.00012877397239208221 - timing_s/old_log_prob:10.291582213714719 - timing_s/ref:12.274182537570596 - timing_s/adv:0.12485390808433294 - timing_s/update_actor:23.744977643713355 - timing_s/update_weights:21.813141676597297 - timing_s/step:97.49679122492671 - timing_s/stop_profile:7.009413093328476e-05 - timing_per_token_ms/adv:0.00010882980726293796 - timing_per_token_ms/update_actor:0.02069748060014762 - timing_per_token_ms/gen:0.030407684246610066 - timing_per_token_ms/ref:0.010698879517424946 - perf/total_num_tokens:1490517 - perf/time_per_step:97.49679122492671 - perf/throughput:3821.9642443445973 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:402.0 - frontier/mean_score:2.2559272285714282 - frontier/mean_frontier_pct:0.17106049399951906 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:14.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.0377299999999994 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.09 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:3.11 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.1340999999999997 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.637 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.6596999999999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:48.0 - frontier/cluster_10/score:3.8596999999999997 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:2.5379299999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.6569999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:2.9176456999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:2.668150999999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:1.49 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.7598999999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.211645699999999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.3 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.3629999999999995 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.6569999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.6569999999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:2.09 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.51 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:2.09 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:1.91 
- frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.8319299999999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.6569999999999996 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.62613 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.841252999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.51 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.763 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.91 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.8319299999999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:2.237 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:1.7 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:2.09 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.5540999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:1.91 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:32.0 - frontier/cluster_58/score:2.7598999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:2.6878699999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:1.7 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.7598999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.91 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:105.0 - cluster/prob_snapshot/cluster_0:0.021373888260779864 - cluster/prob_snapshot/cluster_1:0.014705528952550069 - cluster/prob_snapshot/cluster_2:0.009449533676208011 - cluster/prob_snapshot/cluster_3:0.021882389972454886 - cluster/prob_snapshot/cluster_4:0.015015822649587129 - cluster/prob_snapshot/cluster_5:0.014072276509617291 - cluster/prob_snapshot/cluster_6:0.011518158323121754 - cluster/prob_snapshot/cluster_7:0.018714016916314552 - cluster/prob_snapshot/cluster_8:0.01869501934302657 - cluster/prob_snapshot/cluster_9:0.010483845999664882 - 
cluster/prob_snapshot/cluster_10:0.02715738282208493 - cluster/prob_snapshot/cluster_11:0.011961435033174697 - cluster/prob_snapshot/cluster_12:0.0178572263610265 - cluster/prob_snapshot/cluster_13:0.01869501934302657 - cluster/prob_snapshot/cluster_14:0.020528958523747948 - cluster/prob_snapshot/cluster_15:0.018773479320705937 - cluster/prob_snapshot/cluster_16:0.017660707019569698 - cluster/prob_snapshot/cluster_17:0.010483845999664882 - cluster/prob_snapshot/cluster_18:0.019419037969446377 - cluster/prob_snapshot/cluster_19:0.022597583170661687 - cluster/prob_snapshot/cluster_20:0.016183117986059883 - cluster/prob_snapshot/cluster_21:0.014705528952550069 - cluster/prob_snapshot/cluster_22:0.010483845999664882 - cluster/prob_snapshot/cluster_23:0.016626394696112826 - cluster/prob_snapshot/cluster_24:0.01869501934302657 - cluster/prob_snapshot/cluster_25:0.009449533676208011 - cluster/prob_snapshot/cluster_26:0.011961435033174697 - cluster/prob_snapshot/cluster_27:0.01869501934302657 - cluster/prob_snapshot/cluster_28:0.016183117986059883 - cluster/prob_snapshot/cluster_29:0.011961435033174697 - cluster/prob_snapshot/cluster_30:0.014705528952550069 - cluster/prob_snapshot/cluster_31:0.017660707019569698 - cluster/prob_snapshot/cluster_32:0.010483845999664882 - cluster/prob_snapshot/cluster_33:0.014705528952550069 - cluster/prob_snapshot/cluster_34:0.013439024066684513 - cluster/prob_snapshot/cluster_35:0.019925851007940244 - cluster/prob_snapshot/cluster_36:0.01869501934302657 - cluster/prob_snapshot/cluster_37:0.014072276509617291 - cluster/prob_snapshot/cluster_38:0.014072276509617291 - cluster/prob_snapshot/cluster_39:0.018477813755100627 - cluster/prob_snapshot/cluster_40:0.02702758717969847 - cluster/prob_snapshot/cluster_41:0.017660707019569698 - cluster/prob_snapshot/cluster_42:0.012404711743227641 - cluster/prob_snapshot/cluster_43:0.014072276509617291 - cluster/prob_snapshot/cluster_44:0.013439024066684513 - 
cluster/prob_snapshot/cluster_45:0.014705528952550069 - cluster/prob_snapshot/cluster_46:0.019419037969446377 - cluster/prob_snapshot/cluster_47:0.014072276509617291 - cluster/prob_snapshot/cluster_48:0.016183117986059883 - cluster/prob_snapshot/cluster_49:0.019925851007940244 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.01573984127600694 - cluster/prob_snapshot/cluster_52:0.011961435033174697 - cluster/prob_snapshot/cluster_53:0.009449533676208011 - cluster/prob_snapshot/cluster_54:0.011961435033174697 - cluster/prob_snapshot/cluster_55:0.014705528952550069 - cluster/prob_snapshot/cluster_56:0.01797100071660676 - cluster/prob_snapshot/cluster_57:0.013439024066684513 - cluster/prob_snapshot/cluster_58:0.019419037969446377 - cluster/prob_snapshot/cluster_59:0.01891222493095251 - cluster/prob_snapshot/cluster_60:0.010483845999664882 - cluster/prob_snapshot/cluster_61:0.011961435033174697 - cluster/prob_snapshot/cluster_62:0.019419037969446377 - cluster/prob_snapshot/cluster_63:0.013439024066684513
[36m(RewardLoopWorker pid=2826755)[0m WARNING:2026-04-12 14:46:14,102:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(RewardLoopWorker pid=2826757)[0m WARNING:2026-04-12 14:46:19,068:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  13%|█▎        | 106/800 [3:15:42<21:54:46, 113.67s/it]
[36m(TaskRunner pid=2823680)[0m step:106 - global_seqlen/min:355407 - global_seqlen/max:433959 - global_seqlen/minmax_diff:78552 - global_seqlen/balanced_min:387813 - global_seqlen/balanced_max:387845 - global_seqlen/mean:387826.75 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.24650650040712208) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009112797677516937 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.013940359291154891) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00029869487366340763) - actor/ppo_kl:np.float64(4.66800889113254e-06) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.25251371910174686) - perf/mfu/actor:np.float64(0.1959362818505791) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.40225601196289) - actor/lr:np.float64(1e-06) - training/global_step:106 - training/epoch:0 - critic/score/mean:0.5377604365348816 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5296843647956848 - critic/rewards/max:1.0081992149353027 - critic/rewards/min:-0.055667754262685776 - critic/advantages/mean:-0.10816677659749985 - critic/advantages/max:2.474848985671997 - critic/advantages/min:-2.4748616218566895 - critic/returns/mean:-0.10816677659749985 - critic/returns/max:2.474848985671997 - critic/returns/min:-2.4748616218566895 - response_length/mean:1184.8072509765625 - response_length/max:8192.0 - response_length/min:215.0 - response_length/clip_ratio:0.010416666977107525 - response_length_non_aborted/mean:1184.8072509765625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:215.0 - response_length_non_aborted/clip_ratio:0.010416666977107525 - response/aborted_ratio:0.0 - prompt_length/mean:236.875 - prompt_length/max:401.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.139440953731537e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7461485154926777) - timing_s/agent_loop/generate_sequences/max:np.float64(28.43703083600849) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.842027755868912) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.43703083600849) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:194 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.474323489703238 - timing_s/reward:0.00014826375991106033 - timing_s/old_log_prob:11.265028483234346 - timing_s/ref:21.238490124233067 - timing_s/adv:0.07371799554675817 - timing_s/update_actor:23.16944050975144 - timing_s/update_weights:27.361214806325734 - timing_s/step:113.97304823063314 - timing_s/stop_profile:5.1966868340969086e-05 - timing_per_token_ms/adv:6.751647251345253e-05 - timing_per_token_ms/update_actor:0.02122031237727406 - timing_per_token_ms/gen:0.03349077017810478 - timing_per_token_ms/ref:0.019451803105396213 - perf/total_num_tokens:1551307 - perf/time_per_step:113.97304823063314 - perf/throughput:3402.793520229476 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:434.0 - frontier/mean_score:2.285076985714285 - frontier/mean_frontier_pct:0.18141862866014904 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.0377299999999994 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.09 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:3.11 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.1340999999999997 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.637 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.7617899999999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.201789999999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:2.5379299999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.6569999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:2.9176456999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:2.767705699999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.6569999999999996 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:1.49 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.7598999999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.211645699999999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.91 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.3629999999999995 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.6569999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:1.2401 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:2.09 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.6569999999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.3629999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.51 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:2.09 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - 
frontier/cluster_34/score:2.237 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.8319299999999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.6569999999999996 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.3 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.7382909999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.841252999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.51 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.763 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.91 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:2.51 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.8319299999999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:2.237 - 
frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:1.7 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:2.09 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.0878699999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.237 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:32.0 - frontier/cluster_58/score:2.8319299999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:2.7815089999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:1.7 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.7598999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.91 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:106.0 - cluster/prob_snapshot/cluster_0:0.021101230640973002 - cluster/prob_snapshot/cluster_1:0.014517936761869415 - cluster/prob_snapshot/cluster_2:0.00932898998621561 - cluster/prob_snapshot/cluster_3:0.021603245612159754 - cluster/prob_snapshot/cluster_4:0.014824272173926085 - cluster/prob_snapshot/cluster_5:0.01389276245154968 - cluster/prob_snapshot/cluster_6:0.011371226066593415 - cluster/prob_snapshot/cluster_7:0.019184446205532692 - cluster/prob_snapshot/cluster_8:0.01845653491688375 - cluster/prob_snapshot/cluster_9:0.010350108026404512 - 
cluster/prob_snapshot/cluster_10:0.02918723517064846 - cluster/prob_snapshot/cluster_11:0.01180884808381723 - cluster/prob_snapshot/cluster_12:0.017629429304330736 - cluster/prob_snapshot/cluster_13:0.01845653491688375 - cluster/prob_snapshot/cluster_14:0.02026707931394269 - cluster/prob_snapshot/cluster_15:0.019225538912950008 - cluster/prob_snapshot/cluster_16:0.01845653491688375 - cluster/prob_snapshot/cluster_17:0.010350108026404512 - cluster/prob_snapshot/cluster_18:0.01917131754501598 - cluster/prob_snapshot/cluster_19:0.02230931539432049 - cluster/prob_snapshot/cluster_20:0.013267588141229945 - cluster/prob_snapshot/cluster_21:0.014517936761869415 - cluster/prob_snapshot/cluster_22:0.010350108026404512 - cluster/prob_snapshot/cluster_23:0.016414298836505945 - cluster/prob_snapshot/cluster_24:0.01845653491688375 - cluster/prob_snapshot/cluster_25:0.00861420735808338 - cluster/prob_snapshot/cluster_26:0.014517936761869415 - cluster/prob_snapshot/cluster_27:0.01845653491688375 - cluster/prob_snapshot/cluster_28:0.015976676819282134 - cluster/prob_snapshot/cluster_29:0.01180884808381723 - cluster/prob_snapshot/cluster_30:0.016414298836505945 - cluster/prob_snapshot/cluster_31:0.017435416876694847 - cluster/prob_snapshot/cluster_32:0.010350108026404512 - cluster/prob_snapshot/cluster_33:0.014517936761869415 - cluster/prob_snapshot/cluster_34:0.015539054802058319 - cluster/prob_snapshot/cluster_35:0.01967166538470854 - cluster/prob_snapshot/cluster_36:0.01845653491688375 - cluster/prob_snapshot/cluster_37:0.01389276245154968 - cluster/prob_snapshot/cluster_38:0.015976676819282134 - cluster/prob_snapshot/cluster_39:0.01902121319310821 - cluster/prob_snapshot/cluster_40:0.02668280772265128 - cluster/prob_snapshot/cluster_41:0.017435416876694847 - cluster/prob_snapshot/cluster_42:0.012246470101041044 - cluster/prob_snapshot/cluster_43:0.01389276245154968 - cluster/prob_snapshot/cluster_44:0.013267588141229945 - cluster/prob_snapshot/cluster_45:0.014517936761869415 - 
cluster/prob_snapshot/cluster_46:0.01917131754501598 - cluster/prob_snapshot/cluster_47:0.01389276245154968 - cluster/prob_snapshot/cluster_48:0.017435416876694847 - cluster/prob_snapshot/cluster_49:0.01967166538470854 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.015539054802058319 - cluster/prob_snapshot/cluster_52:0.01180884808381723 - cluster/prob_snapshot/cluster_53:0.00932898998621561 - cluster/prob_snapshot/cluster_54:0.01180884808381723 - cluster/prob_snapshot/cluster_55:0.014517936761869415 - cluster/prob_snapshot/cluster_56:0.014503140969858514 - cluster/prob_snapshot/cluster_57:0.015539054802058319 - cluster/prob_snapshot/cluster_58:0.01967166538470854 - cluster/prob_snapshot/cluster_59:0.019321421896923745 - cluster/prob_snapshot/cluster_60:0.010350108026404512 - cluster/prob_snapshot/cluster_61:0.01180884808381723 - cluster/prob_snapshot/cluster_62:0.01917131754501598 - cluster/prob_snapshot/cluster_63:0.013267588141229945
[36m(TaskRunner pid=2823680)[0m Training Progress:  13%|█▎        | 107/800 [3:17:33<21:42:08, 112.74s/it]
[36m(TaskRunner pid=2823680)[0m step:107 - global_seqlen/min:345543 - global_seqlen/max:430508 - global_seqlen/minmax_diff:84965 - global_seqlen/balanced_min:381797 - global_seqlen/balanced_max:381956 - global_seqlen/mean:381899.25 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.2496325405728486) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010425683110952377 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.07206687238067389) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00038616403576775663) - actor/ppo_kl:np.float64(0.00011792172126598011) - actor/pg_clipfrac_lower:np.float64(6.228899615558071e-07) - actor/grad_norm:np.float64(0.22291401587426662) - perf/mfu/actor:np.float64(0.2225756654761296) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.18570709228516) - actor/lr:np.float64(1e-06) - training/global_step:107 - training/epoch:0 - critic/score/mean:0.5 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4914911389350891 - critic/rewards/max:1.0137653350830078 - critic/rewards/min:-0.04551386088132858 - critic/advantages/mean:-0.14859117567539215 - critic/advantages/max:2.4748454093933105 - critic/advantages/min:-2.4748523235321045 - critic/returns/mean:-0.14859117567539215 - critic/returns/max:2.4748454093933105 - critic/returns/min:-2.4748523235321045 - response_length/mean:1216.6763916015625 - response_length/max:8192.0 - response_length/min:140.0 - response_length/clip_ratio:0.013888888992369175 - response_length_non_aborted/mean:1216.6763916015625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:140.0 - response_length_non_aborted/clip_ratio:0.013888888992369175 - response/aborted_ratio:0.0 - prompt_length/mean:237.1777801513672 - prompt_length/max:404.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.002536535263062e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1997106363996863) - timing_s/agent_loop/generate_sequences/max:np.float64(28.49738644901663) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.831454501768349) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.49738644901663) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:204 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.91309495549649 - timing_s/reward:0.00011545605957508087 - timing_s/old_log_prob:9.68785518500954 - timing_s/ref:21.242029814980924 - timing_s/adv:0.07639643736183643 - timing_s/update_actor:20.1193987717852 - timing_s/update_weights:28.931965801864862 - timing_s/step:110.34151971805841 - timing_s/stop_profile:5.349237471818924e-05 - timing_per_token_ms/adv:7.298267283975681e-05 - timing_per_token_ms/update_actor:0.019220366145337058 - timing_per_token_ms/gen:0.03414709580573727 - timing_per_token_ms/ref:0.02029283257145129 - perf/total_num_tokens:1527597 - perf/time_per_step:110.34151971805841 - perf/throughput:3461.0657074129335 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:472.0 - frontier/mean_score:2.2887318714285714 - frontier/mean_frontier_pct:0.20683978027597408 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.0377299999999994 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.09 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:3.11 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:2.3938699999999997 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:1.7 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.637 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.7617899999999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.201789999999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:2.5379299999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.6569999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:2.9176456999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:2.767705699999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.1598999999999995 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:1.49 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.7598999999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.211645699999999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.91 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.3629999999999995 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.6569999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:1.2401 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:2.09 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.6569999999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.5540999999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.51 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:2.3629999999999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - 
frontier/cluster_34/score:1.8659000000000001 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.8319299999999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.6569999999999996 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:1.7 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:2.51 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.8168036999999995 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:4.188877099999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.51 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.763 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.91 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.3 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:2.51 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.8319299999999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:2.237 - 
frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:1.7 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:2.09 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.0878699999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.4659 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.8823509999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:2.7815089999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.8319299999999994 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.91 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:107.0 - cluster/prob_snapshot/cluster_0:0.02106753399551343 - cluster/prob_snapshot/cluster_1:0.014494753006561832 - cluster/prob_snapshot/cluster_2:0.008600451293510685 - cluster/prob_snapshot/cluster_3:0.0215687472968456 - cluster/prob_snapshot/cluster_4:0.016602179129099604 - cluster/prob_snapshot/cluster_5:0.011789990483806276 - cluster/prob_snapshot/cluster_6:0.011353067307053455 - cluster/prob_snapshot/cluster_7:0.019153810481336075 - cluster/prob_snapshot/cluster_8:0.01842706159733722 - cluster/prob_snapshot/cluster_9:0.010333579894630207 - 
cluster/prob_snapshot/cluster_10:0.029140625949971976 - cluster/prob_snapshot/cluster_11:0.011789990483806276 - cluster/prob_snapshot/cluster_12:0.017601276793274386 - cluster/prob_snapshot/cluster_13:0.01842706159733722 - cluster/prob_snapshot/cluster_14:0.02023471472830488 - cluster/prob_snapshot/cluster_15:0.019194837567633162 - cluster/prob_snapshot/cluster_16:0.014979529674101865 - cluster/prob_snapshot/cluster_17:0.010333579894630207 - cluster/prob_snapshot/cluster_18:0.01914070278603349 - cluster/prob_snapshot/cluster_19:0.022273689553151375 - cluster/prob_snapshot/cluster_20:0.013246401072982346 - cluster/prob_snapshot/cluster_21:0.014494753006561832 - cluster/prob_snapshot/cluster_22:0.010333579894630207 - cluster/prob_snapshot/cluster_23:0.01638808677249072 - cluster/prob_snapshot/cluster_24:0.01842706159733722 - cluster/prob_snapshot/cluster_25:0.008600451293510685 - cluster/prob_snapshot/cluster_26:0.014494753006561832 - cluster/prob_snapshot/cluster_27:0.01842706159733722 - cluster/prob_snapshot/cluster_28:0.0159511635957379 - cluster/prob_snapshot/cluster_29:0.011789990483806276 - cluster/prob_snapshot/cluster_30:0.017713420408640945 - cluster/prob_snapshot/cluster_31:0.01740757418491397 - cluster/prob_snapshot/cluster_32:0.010333579894630207 - cluster/prob_snapshot/cluster_33:0.01638808677249072 - cluster/prob_snapshot/cluster_34:0.012940554849255373 - cluster/prob_snapshot/cluster_35:0.019640251618120882 - cluster/prob_snapshot/cluster_36:0.01842706159733722 - cluster/prob_snapshot/cluster_37:0.011789990483806276 - cluster/prob_snapshot/cluster_38:0.01740757418491397 - cluster/prob_snapshot/cluster_39:0.01953534636338253 - cluster/prob_snapshot/cluster_40:0.02905107126284354 - cluster/prob_snapshot/cluster_41:0.01740757418491397 - cluster/prob_snapshot/cluster_42:0.012226913660559096 - cluster/prob_snapshot/cluster_43:0.01387057703977209 - cluster/prob_snapshot/cluster_44:0.013246401072982346 - cluster/prob_snapshot/cluster_45:0.014494753006561832 - 
cluster/prob_snapshot/cluster_46:0.01914070278603349 - cluster/prob_snapshot/cluster_47:0.0159511635957379 - cluster/prob_snapshot/cluster_48:0.01740757418491397 - cluster/prob_snapshot/cluster_49:0.019640251618120882 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.015514240418985083 - cluster/prob_snapshot/cluster_52:0.011789990483806276 - cluster/prob_snapshot/cluster_53:0.009314092482206958 - cluster/prob_snapshot/cluster_54:0.011789990483806276 - cluster/prob_snapshot/cluster_55:0.014494753006561832 - cluster/prob_snapshot/cluster_56:0.014479980842014474 - cluster/prob_snapshot/cluster_57:0.017101727961187 - cluster/prob_snapshot/cluster_58:0.019989935800582056 - cluster/prob_snapshot/cluster_59:0.019290567435659708 - cluster/prob_snapshot/cluster_60:0.010333579894630207 - cluster/prob_snapshot/cluster_61:0.010333579894630207 - cluster/prob_snapshot/cluster_62:0.019640251618120882 - cluster/prob_snapshot/cluster_63:0.013246401072982346
[36m(TaskRunner pid=2823680)[0m Training Progress:  14%|█▎        | 108/800 [3:19:23<21:30:40, 111.91s/it]
[36m(TaskRunner pid=2823680)[0m step:108 - global_seqlen/min:346091 - global_seqlen/max:422036 - global_seqlen/minmax_diff:75945 - global_seqlen/balanced_min:379699 - global_seqlen/balanced_max:379862 - global_seqlen/mean:379796.5 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.25880346480343075) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009949034079909325 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.006988967843426508) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00033134221096891756) - actor/ppo_kl:np.float64(1.1934659102684641e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2120296210050583) - perf/mfu/actor:np.float64(0.21978291995005456) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.60586166381836) - actor/lr:np.float64(1e-06) - training/global_step:108 - training/epoch:0 - critic/score/mean:0.5666666626930237 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5582435131072998 - critic/rewards/max:1.0082818269729614 - critic/rewards/min:-0.05853094533085823 - critic/advantages/mean:-0.18899381160736084 - critic/advantages/max:2.474822759628296 - critic/advantages/min:-2.474858283996582 - critic/returns/mean:-0.18899381160736084 - critic/returns/max:2.474822759628296 - critic/returns/min:-2.474858283996582 - response_length/mean:1147.5638427734375 - response_length/max:8192.0 - response_length/min:170.0 - response_length/clip_ratio:0.009722222574055195 - response_length_non_aborted/mean:1147.5638427734375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:170.0 - response_length_non_aborted/clip_ratio:0.009722222574055195 - response/aborted_ratio:0.0 - prompt_length/mean:239.8111114501953 - prompt_length/max:375.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00013204850256443024 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.328260624781251) - timing_s/agent_loop/generate_sequences/max:np.float64(29.861787693575025) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.681781288118145) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.861787693575025) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:220 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.866796107962728 - timing_s/reward:0.00013165269047021866 - timing_s/old_log_prob:9.62405049148947 - timing_s/ref:19.615927283652127 - timing_s/adv:0.08070848044008017 - timing_s/update_actor:20.274751506745815 - timing_s/update_weights:27.894757733680308 - timing_s/step:109.73975694831461 - timing_s/stop_profile:5.8175064623355865e-05 - timing_per_token_ms/adv:8.079654867813934e-05 - timing_per_token_ms/update_actor:0.020296875100605474 - timing_per_token_ms/gen:0.03856816990092869 - timing_per_token_ms/ref:0.019637331975505427 - perf/total_num_tokens:1519186 - perf/time_per_step:109.73975694831461 - perf/throughput:3460.883371364465 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:510.0 - frontier/mean_score:2.302989570476191 - frontier/mean_frontier_pct:0.21858481054176554 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.0377299999999994 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.09 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:3.11 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:2.575709 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:1.7 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.637 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.833252999999999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.343 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.201789999999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:2.5379299999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.6569999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.9423519899999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:2.767705699999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.1598999999999995 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:1.343 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.7598999999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.1481519899999992 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.91 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.3629999999999995 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.6569999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:1.2401 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:2.09 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.1598999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.5540999999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.51 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:2.3629999999999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 
- frontier/cluster_34/score:1.8659000000000001 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.8319299999999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.6569999999999996 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.09 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:2.6569999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.8717625899999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:4.432213969999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.51 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.763 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:2.237 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.3 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:2.51 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.8319299999999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - 
frontier/cluster_51/score:2.237 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:2.09 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:2.09 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.0878699999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.0261299999999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.9176456999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:2.7815089999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.8319299999999994 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:2.237 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:108.0 - cluster/prob_snapshot/cluster_0:0.020937105893174515 - cluster/prob_snapshot/cluster_1:0.014405016679143552 - cluster/prob_snapshot/cluster_2:0.00854720630804111 - cluster/prob_snapshot/cluster_3:0.02143521620676385 - cluster/prob_snapshot/cluster_4:0.017752694308909165 - cluster/prob_snapshot/cluster_5:0.011716999212700497 - cluster/prob_snapshot/cluster_6:0.011282781006582773 - cluster/prob_snapshot/cluster_7:0.019527778335518418 - cluster/prob_snapshot/cluster_8:0.01831298053420307 - 
cluster/prob_snapshot/cluster_9:0.009256429378033393 - cluster/prob_snapshot/cluster_10:0.028960217718784008 - cluster/prob_snapshot/cluster_11:0.011716999212700497 - cluster/prob_snapshot/cluster_12:0.017492308124640567 - cluster/prob_snapshot/cluster_13:0.01831298053420307 - cluster/prob_snapshot/cluster_14:0.020279727029598668 - cluster/prob_snapshot/cluster_15:0.019076003239933335 - cluster/prob_snapshot/cluster_16:0.014886792117359882 - cluster/prob_snapshot/cluster_17:0.009256429378033393 - cluster/prob_snapshot/cluster_18:0.01902220360419535 - cluster/prob_snapshot/cluster_19:0.021698173169583234 - cluster/prob_snapshot/cluster_20:0.013164393233092911 - cluster/prob_snapshot/cluster_21:0.014405016679143552 - cluster/prob_snapshot/cluster_22:0.010269605192308084 - cluster/prob_snapshot/cluster_23:0.01628662890565369 - cluster/prob_snapshot/cluster_24:0.01831298053420307 - cluster/prob_snapshot/cluster_25:0.00854720630804111 - cluster/prob_snapshot/cluster_26:0.014405016679143552 - cluster/prob_snapshot/cluster_27:0.014886792117359882 - cluster/prob_snapshot/cluster_28:0.015852410699535965 - cluster/prob_snapshot/cluster_29:0.011716999212700497 - cluster/prob_snapshot/cluster_30:0.017603757464210786 - cluster/prob_snapshot/cluster_31:0.01729980471992838 - cluster/prob_snapshot/cluster_32:0.010269605192308084 - cluster/prob_snapshot/cluster_33:0.01628662890565369 - cluster/prob_snapshot/cluster_34:0.012860440488810505 - cluster/prob_snapshot/cluster_35:0.01951865975318995 - cluster/prob_snapshot/cluster_36:0.01831298053420307 - cluster/prob_snapshot/cluster_37:0.014405016679143552 - cluster/prob_snapshot/cluster_38:0.01831298053420307 - cluster/prob_snapshot/cluster_39:0.019793200003583963 - cluster/prob_snapshot/cluster_40:0.030548380939417728 - cluster/prob_snapshot/cluster_41:0.01729980471992838 - cluster/prob_snapshot/cluster_42:0.01215121741881822 - cluster/prob_snapshot/cluster_43:0.013784704956118232 - 
cluster/prob_snapshot/cluster_44:0.015418192493418243 - cluster/prob_snapshot/cluster_45:0.014405016679143552 - cluster/prob_snapshot/cluster_46:0.01902220360419535 - cluster/prob_snapshot/cluster_47:0.015852410699535965 - cluster/prob_snapshot/cluster_48:0.01729980471992838 - cluster/prob_snapshot/cluster_49:0.01951865975318995 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.015418192493418243 - cluster/prob_snapshot/cluster_52:0.014405016679143552 - cluster/prob_snapshot/cluster_53:0.009256429378033393 - cluster/prob_snapshot/cluster_54:0.011716999212700497 - cluster/prob_snapshot/cluster_55:0.014405016679143552 - cluster/prob_snapshot/cluster_56:0.014390335968365284 - cluster/prob_snapshot/cluster_57:0.013964802126369915 - cluster/prob_snapshot/cluster_58:0.020109442570493522 - cluster/prob_snapshot/cluster_59:0.01917114044889373 - cluster/prob_snapshot/cluster_60:0.010269605192308084 - cluster/prob_snapshot/cluster_61:0.010269605192308084 - cluster/prob_snapshot/cluster_62:0.01951865975318995 - cluster/prob_snapshot/cluster_63:0.015418192493418243
[36m(TaskRunner pid=2823680)[0m Training Progress:  14%|█▎        | 109/800 [3:21:12<21:20:05, 111.15s/it]
[36m(TaskRunner pid=2823680)[0m step:109 - global_seqlen/min:334276 - global_seqlen/max:427876 - global_seqlen/minmax_diff:93600 - global_seqlen/balanced_min:375841 - global_seqlen/balanced_max:376197 - global_seqlen/mean:375938.0 - frontier/skipped_zero_acc_count:27.0 - actor/entropy:np.float64(0.23721720070085106) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009696494787931442 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05612550399382599) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002689446773015958) - actor/ppo_kl:np.float64(4.309151654106646e-05) - actor/pg_clipfrac_lower:np.float64(1.0648361471705322e-06) - actor/grad_norm:np.float64(0.21112413360522345) - perf/mfu/actor:np.float64(0.19759661705214387) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.82537460327148) - actor/lr:np.float64(1e-06) - training/global_step:109 - training/epoch:0 - critic/score/mean:0.4752475321292877 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4661685824394226 - critic/rewards/max:1.0020414590835571 - critic/rewards/min:-0.08014538139104843 - critic/advantages/mean:-0.14534997940063477 - critic/advantages/max:2.474841594696045 - critic/advantages/min:-2.474846601486206 - critic/returns/mean:-0.14534997940063477 - critic/returns/max:2.474841594696045 - critic/returns/min:-2.474846601486206 - response_length/mean:1215.9517822265625 - response_length/max:8192.0 - response_length/min:197.0 - response_length/clip_ratio:0.011138614267110825 - response_length_non_aborted/mean:1215.9517822265625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:197.0 - response_length_non_aborted/clip_ratio:0.011138614267110825 - response/aborted_ratio:0.0 - prompt_length/mean:237.99009704589844 - prompt_length/max:461.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.509494364261627e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5343779623508453) - timing_s/agent_loop/generate_sequences/max:np.float64(27.572858803905547) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.782149785918591) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.572858803905547) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:229 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.178561273030937 - timing_s/reward:0.00023862812668085098 - timing_s/old_log_prob:10.413464656099677 - timing_s/ref:19.960034514777362 - timing_s/adv:0.09918787144124508 - timing_s/update_actor:22.26929388847202 - timing_s/update_weights:26.885433418676257 - timing_s/step:109.19422986824065 - timing_s/stop_profile:6.293505430221558e-05 - timing_per_token_ms/adv:8.443065875138436e-05 - timing_per_token_ms/update_actor:0.01895605909887513 - timing_per_token_ms/gen:0.029698613697487643 - timing_per_token_ms/ref:0.01699037229346422 - perf/total_num_tokens:1503752 - perf/time_per_step:109.19422986824065 - perf/throughput:3442.8375973128436 - frontier/active_count:62.0 - frontier/completed_count:2.0 - frontier/blacklisted_count:537.0 - frontier/mean_score:2.3295774850322584 - frontier/mean_frontier_pct:0.2302582662348101 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.0377299999999994 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.09 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:3.11 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:2.575709 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.09 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.637 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.883277099999999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.343 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.201789999999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:2.5379299999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.6569999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.9423519899999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:2.767705699999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.1598999999999995 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:1.343 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.7598999999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.1481519899999992 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.91 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.3629999999999995 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.6569999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:1.763 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.1598999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.6878699999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.6569999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:2.3629999999999995 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:1.8659000000000001 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:2.8823509999999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:32.0 - frontier/cluster_36/score:2.7598999999999996 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.09 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:2.6569999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.9102338129999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:4.002549778999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.51 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.763 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:2.237 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:1.91 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:2.6569999999999996 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.8319299999999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:1.8659000000000001 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:2.09 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.3629999999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.361509 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.0261299999999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.9176456999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:2.7815089999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.8319299999999994 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:2.237 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:109.0 - cluster/prob_snapshot/cluster_0:0.021031987764344253 - cluster/prob_snapshot/cluster_1:0.014470296710859587 - cluster/prob_snapshot/cluster_2:0.008585940168008122 - cluster/prob_snapshot/cluster_3:0.021532355392714506 - cluster/prob_snapshot/cluster_4:0.017833145201354754 - cluster/prob_snapshot/cluster_5:0.014470296710859587 - cluster/prob_snapshot/cluster_6:0.011333911825682845 - cluster/prob_snapshot/cluster_7:0.019962619682596536 
- cluster/prob_snapshot/cluster_8:0.01839597050753776 - cluster/prob_snapshot/cluster_9:0.00929837726444231 - cluster/prob_snapshot/cluster_10:0.029091458381207032 - cluster/prob_snapshot/cluster_11:0.011770097803091531 - cluster/prob_snapshot/cluster_12:0.01757157901023534 - cluster/prob_snapshot/cluster_13:0.01839597050753776 - cluster/prob_snapshot/cluster_14:0.020371629819659405 - cluster/prob_snapshot/cluster_15:0.01916245104657288 - cluster/prob_snapshot/cluster_16:0.014954255438174937 - cluster/prob_snapshot/cluster_17:0.00929837726444231 - cluster/prob_snapshot/cluster_18:0.019108407603971946 - cluster/prob_snapshot/cluster_19:0.02179650401252778 - cluster/prob_snapshot/cluster_20:0.013224051061120484 - cluster/prob_snapshot/cluster_21:0.014470296710859587 - cluster/prob_snapshot/cluster_22:0.010316144545062578 - cluster/prob_snapshot/cluster_23:0.016360435946297224 - cluster/prob_snapshot/cluster_24:0.01839597050753776 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.012206283780500217 - cluster/prob_snapshot/cluster_27:0.014954255438174937 - cluster/prob_snapshot/cluster_28:0.01592424996888854 - cluster/prob_snapshot/cluster_29:0.011770097803091531 - cluster/prob_snapshot/cluster_30:0.018609701636468017 - cluster/prob_snapshot/cluster_31:0.01839597050753776 - cluster/prob_snapshot/cluster_32:0.010316144545062578 - cluster/prob_snapshot/cluster_33:0.016360435946297224 - cluster/prob_snapshot/cluster_34:0.012918720876934405 - cluster/prob_snapshot/cluster_35:0.01995620774872863 - cluster/prob_snapshot/cluster_36:0.019108407603971946 - cluster/prob_snapshot/cluster_37:0.014470296710859587 - cluster/prob_snapshot/cluster_38:0.01839597050753776 - cluster/prob_snapshot/cluster_39:0.020149256828749403 - cluster/prob_snapshot/cluster_40:0.027712001388571987 - cluster/prob_snapshot/cluster_41:0.017378203226917494 - cluster/prob_snapshot/cluster_42:0.012206283780500217 - cluster/prob_snapshot/cluster_43:0.013847173885990036 - 
cluster/prob_snapshot/cluster_44:0.015488063991479856 - cluster/prob_snapshot/cluster_45:0.014470296710859587 - cluster/prob_snapshot/cluster_46:0.019108407603971946 - cluster/prob_snapshot/cluster_47:0.013224051061120484 - cluster/prob_snapshot/cluster_48:0.01839597050753776 - cluster/prob_snapshot/cluster_49:0.01960711357147588 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.012918720876934405 - cluster/prob_snapshot/cluster_52:0.014470296710859587 - cluster/prob_snapshot/cluster_53:0.012740092333805133 - cluster/prob_snapshot/cluster_54:0.011770097803091531 - cluster/prob_snapshot/cluster_55:0.016360435946297224 - cluster/prob_snapshot/cluster_56:0.01635011287816522 - cluster/prob_snapshot/cluster_57:0.014028087212810495 - cluster/prob_snapshot/cluster_58:0.020200573672805557 - cluster/prob_snapshot/cluster_59:0.019258019394223124 - cluster/prob_snapshot/cluster_60:0.010316144545062578 - cluster/prob_snapshot/cluster_61:0.010316144545062578 - cluster/prob_snapshot/cluster_62:0.01960711357147588 - cluster/prob_snapshot/cluster_63:0.015488063991479856
[36m(TaskRunner pid=2823680)[0m Training Progress:  14%|█▍        | 110/800 [3:23:01<21:10:02, 110.44s/it]
[36m(TaskRunner pid=2823680)[0m step:110 - global_seqlen/min:361066 - global_seqlen/max:409426 - global_seqlen/minmax_diff:48360 - global_seqlen/balanced_min:392265 - global_seqlen/balanced_max:392379 - global_seqlen/mean:392335.0 - frontier/skipped_zero_acc_count:40.0 - actor/entropy:np.float64(0.2597568984228102) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009825699962675571 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05139847188547719) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000349270989318029) - actor/ppo_kl:np.float64(6.924896440419947e-05) - actor/pg_clipfrac_lower:np.float64(1.444425643884725e-05) - actor/grad_norm:np.float64(0.21059616926041516) - perf/mfu/actor:np.float64(0.23830309211867348) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.21193313598633) - actor/lr:np.float64(1e-06) - training/global_step:110 - training/epoch:0 - critic/score/mean:0.5340909361839294 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5246562957763672 - critic/rewards/max:1.0005943775177002 - critic/rewards/min:-0.07011175155639648 - critic/advantages/mean:-0.12521637976169586 - critic/advantages/max:2.474860191345215 - critic/advantages/min:-2.4748587608337402 - critic/returns/mean:-0.12521637976169586 - critic/returns/max:2.474860191345215 - critic/returns/min:-2.4748587608337402 - response_length/mean:1219.8011474609375 - response_length/max:8192.0 - response_length/min:206.0 - response_length/clip_ratio:0.007102272938936949 - response_length_non_aborted/mean:1219.8011474609375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:206.0 - response_length_non_aborted/clip_ratio:0.007102272938936949 - response/aborted_ratio:0.0 - prompt_length/mean:251.13636779785156 - prompt_length/max:502.0 - prompt_length/min:184.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.647795766592026e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.9604426193982363) - timing_s/agent_loop/generate_sequences/max:np.float64(28.28673394396901) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.350251302294964) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.28673394396901) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:206 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.912412104196846 - timing_s/reward:0.00012410525232553482 - timing_s/old_log_prob:8.434486661106348 - timing_s/ref:21.313366101123393 - timing_s/adv:0.06450432259589434 - timing_s/update_actor:19.250049140304327 - timing_s/update_weights:29.16764901392162 - timing_s/step:108.49690678343177 - timing_s/stop_profile:5.972757935523987e-05 - timing_per_token_ms/adv:6.229051760037694e-05 - timing_per_token_ms/update_actor:0.01858938248672608 - timing_per_token_ms/gen:0.0348329088014962 - timing_per_token_ms/ref:0.02058188587705293 - perf/total_num_tokens:1569340 - perf/time_per_step:108.49690678343177 - perf/throughput:3616.093874299394 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:577.0 - frontier/mean_score:2.3222132863114755 - frontier/mean_frontier_pct:0.23005398247818654 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.0377299999999994 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:1.763 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:3.11 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.09 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:2.0458999999999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.883277099999999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.343 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.201789999999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.0765509999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.6569999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.9596463929999994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:2.767705699999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.1598999999999995 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:1.343 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.8319299999999994 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.1481519899999992 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.91 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.5540999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.6569999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.5340999999999998 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.1598999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:2.09 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.1815089999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.6569999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:2.3629999999999995 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:1.8659000000000001 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:2.9176456999999996 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:32.0 - frontier/cluster_36/score:2.7598999999999996 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.09 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:2.6569999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.9102338129999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:4.002549778999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.6569999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.763 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:2.237 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:1.91 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.7598999999999996 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.8319299999999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - 
frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:2.20613 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:2.09 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.5540999999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.361509 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.0261299999999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:2.3423519899999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:2.7815089999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.8319299999999994 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:2.237 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:110.0 - cluster/prob_snapshot/cluster_0:0.021444564438831193 - cluster/prob_snapshot/cluster_1:0.012445729905442352 - cluster/prob_snapshot/cluster_2:0.0087543673600335 - cluster/prob_snapshot/cluster_3:0.021954747592697513 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01475415513464238 - cluster/prob_snapshot/cluster_6:0.014442835401897053 - 
cluster/prob_snapshot/cluster_7:0.020354218961512814 - cluster/prob_snapshot/cluster_8:0.018756837412796556 - cluster/prob_snapshot/cluster_9:0.009480780069772592 - cluster/prob_snapshot/cluster_10:0.02966213469052105 - cluster/prob_snapshot/cluster_11:0.012000987430091889 - cluster/prob_snapshot/cluster_12:0.014659213205261609 - cluster/prob_snapshot/cluster_13:0.018756837412796556 - cluster/prob_snapshot/cluster_14:0.020893340682299876 - cluster/prob_snapshot/cluster_15:0.019538353715231566 - cluster/prob_snapshot/cluster_16:0.015247607500150273 - cluster/prob_snapshot/cluster_17:0.009480780069772592 - cluster/prob_snapshot/cluster_18:0.01999173901935301 - cluster/prob_snapshot/cluster_19:0.02222407791765221 - cluster/prob_snapshot/cluster_20:0.013483462347926769 - cluster/prob_snapshot/cluster_21:0.01475415513464238 - cluster/prob_snapshot/cluster_22:0.010518512512257008 - cluster/prob_snapshot/cluster_23:0.018030424703057463 - cluster/prob_snapshot/cluster_24:0.018756837412796556 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.010829832245002332 - cluster/prob_snapshot/cluster_27:0.015247607500150273 - cluster/prob_snapshot/cluster_28:0.01623663005247726 - cluster/prob_snapshot/cluster_29:0.01475415513464238 - cluster/prob_snapshot/cluster_30:0.015400154169195482 - cluster/prob_snapshot/cluster_31:0.018756837412796556 - cluster/prob_snapshot/cluster_32:0.010518512512257008 - cluster/prob_snapshot/cluster_33:0.016681372527827722 - cluster/prob_snapshot/cluster_34:0.013172142615181445 - cluster/prob_snapshot/cluster_35:0.020596840806565674 - cluster/prob_snapshot/cluster_36:0.019483250122535645 - cluster/prob_snapshot/cluster_37:0.01475415513464238 - cluster/prob_snapshot/cluster_38:0.018756837412796556 - cluster/prob_snapshot/cluster_39:0.020544517299083166 - cluster/prob_snapshot/cluster_40:0.028255617403585915 - cluster/prob_snapshot/cluster_41:0.018756837412796556 - cluster/prob_snapshot/cluster_42:0.012445729905442352 - 
cluster/prob_snapshot/cluster_43:0.014118808741284575 - cluster/prob_snapshot/cluster_44:0.0157918875771268 - cluster/prob_snapshot/cluster_45:0.01475415513464238 - cluster/prob_snapshot/cluster_46:0.019483250122535645 - cluster/prob_snapshot/cluster_47:0.013483462347926769 - cluster/prob_snapshot/cluster_48:0.019483250122535645 - cluster/prob_snapshot/cluster_49:0.01999173901935301 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.015573963764205069 - cluster/prob_snapshot/cluster_52:0.01475415513464238 - cluster/prob_snapshot/cluster_53:0.012990009982418872 - cluster/prob_snapshot/cluster_54:0.012000987430091889 - cluster/prob_snapshot/cluster_55:0.018030424703057463 - cluster/prob_snapshot/cluster_56:0.016670846955911096 - cluster/prob_snapshot/cluster_57:0.014303270977489456 - cluster/prob_snapshot/cluster_58:0.016535609875788657 - cluster/prob_snapshot/cluster_59:0.019635796791580853 - cluster/prob_snapshot/cluster_60:0.010518512512257008 - cluster/prob_snapshot/cluster_61:0.010518512512257008 - cluster/prob_snapshot/cluster_62:0.01999173901935301 - cluster/prob_snapshot/cluster_63:0.0157918875771268
(RewardLoopWorker pid=2826758) WARNING:2026-04-12 14:55:24,439:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  14%|█▍        | 111/800 [3:24:53<21:15:24, 111.07s/it]
(RewardLoopWorker pid=2826757) WARNING:2026-04-12 14:55:25,920:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m step:111 - global_seqlen/min:267170 - global_seqlen/max:451077 - global_seqlen/minmax_diff:183907 - global_seqlen/balanced_min:372859 - global_seqlen/balanced_max:372989 - global_seqlen/mean:372934.75 - frontier/skipped_zero_acc_count:27.0 - actor/entropy:np.float64(0.22089319237891367) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010512938722968102 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.019745798341318732) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000370870623109844) - actor/ppo_kl:np.float64(-7.955154790592775e-07) - actor/pg_clipfrac_lower:np.float64(3.069480846855132e-06) - actor/grad_norm:np.float64(0.20352319799936736) - perf/mfu/actor:np.float64(0.1986322502842644) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.32824897766113) - actor/lr:np.float64(1e-06) - training/global_step:111 - training/epoch:0 - critic/score/mean:0.5804455280303955 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5712823867797852 - critic/rewards/max:1.0008994340896606 - critic/rewards/min:-0.04192651808261871 - critic/advantages/mean:-0.16177760064601898 - critic/advantages/max:2.474825382232666 - critic/advantages/min:-2.4748623371124268 - critic/returns/mean:-0.16177760064601898 - critic/returns/max:2.474825382232666 - critic/returns/min:-2.4748623371124268 - response_length/mean:1200.1126708984375 - response_length/max:8192.0 - response_length/min:223.0 - response_length/clip_ratio:0.008663366548717022 - response_length_non_aborted/mean:1200.1126708984375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:223.0 - response_length_non_aborted/clip_ratio:0.008663366548717022 - response/aborted_ratio:0.0 - prompt_length/mean:235.98019409179688 - prompt_length/max:502.0 - prompt_length/min:178.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.507779031991959e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7299348199740052) - timing_s/agent_loop/generate_sequences/max:np.float64(27.98245166055858) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.751872323864518) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.98245166055858) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:215 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.625893187709153 - timing_s/reward:0.00012983940541744232 - timing_s/old_log_prob:9.802018735557795 - timing_s/ref:21.978584957309067 - timing_s/adv:0.08613620791584253 - timing_s/update_actor:21.975820469669998 - timing_s/update_weights:27.391530045308173 - timing_s/step:112.31449143681675 - timing_s/stop_profile:5.031749606132507e-05 - timing_per_token_ms/adv:7.423212211682252e-05 - timing_per_token_ms/update_actor:0.01893874629721044 - timing_per_token_ms/gen:0.03158314678357245 - timing_per_token_ms/ref:0.01894112873067227 - perf/total_num_tokens:1491739 - perf/time_per_step:112.31449143681675 - perf/throughput:3320.4508628327526 - frontier/active_count:60.0 - frontier/completed_count:4.0 - frontier/blacklisted_count:604.0 - frontier/mean_score:2.308291250035 - frontier/mean_frontier_pct:0.24871394139615016 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.6264109999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:1.763 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:3.6769999999999996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.09 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:2.0458999999999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:80.0 - frontier/cluster_7/score:2.9182939699999992 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.343 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.201789999999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.49 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.0765509999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.6569999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.9596463929999994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:2.767705699999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.1598999999999995 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:1.343 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:2.8823509999999994 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.1481519899999992 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.91 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.5540999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.6569999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.5340999999999998 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.1598999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.763 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.1815089999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.6569999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:2.5540999999999996 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:1.8659000000000001 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:2.9423519899999997 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.2319299999999993 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:2.3629999999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:2.6569999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.9371636690999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.6569999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.763 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:2.237 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:1.91 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.7598999999999996 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.8319299999999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - 
frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:1.844291 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.3629999999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.5540999999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.5530562999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.0261299999999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:2.3423519899999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:2.7815089999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:2.8823509999999994 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:2.237 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:111.0 - cluster/prob_snapshot/cluster_0:0.0261839502846172 - cluster/prob_snapshot/cluster_1:0.012729473948700278 - cluster/prob_snapshot/cluster_2:0.008953953853535573 - cluster/prob_snapshot/cluster_3:0.026549220481775902 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.015090527823473386 - 
cluster/prob_snapshot/cluster_6:0.014772110466049855 - cluster/prob_snapshot/cluster_7:0.021071098732707943 - cluster/prob_snapshot/cluster_8:0.01918446527606162 - cluster/prob_snapshot/cluster_9:0.009696927687523808 - cluster/prob_snapshot/cluster_10:0.030338387035115898 - cluster/prob_snapshot/cluster_11:0.010758318878935573 - cluster/prob_snapshot/cluster_12:0.014993421359981567 - cluster/prob_snapshot/cluster_13:0.01918446527606162 - cluster/prob_snapshot/cluster_14:0.021369677627372796 - cluster/prob_snapshot/cluster_15:0.019983798982313814 - cluster/prob_snapshot/cluster_16:0.015595230165512038 - cluster/prob_snapshot/cluster_17:0.009696927687523808 - cluster/prob_snapshot/cluster_18:0.020811577972495852 - cluster/prob_snapshot/cluster_19:0.022730753683070863 - cluster/prob_snapshot/cluster_20:0.013790865140112043 - cluster/prob_snapshot/cluster_21:0.015090527823473386 - cluster/prob_snapshot/cluster_22:0.010758318878935573 - cluster/prob_snapshot/cluster_23:0.018441491442073382 - cluster/prob_snapshot/cluster_24:0.01918446527606162 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.011076736236359101 - cluster/prob_snapshot/cluster_27:0.015595230165512038 - cluster/prob_snapshot/cluster_28:0.01660680095406162 - cluster/prob_snapshot/cluster_29:0.012729473948700278 - cluster/prob_snapshot/cluster_30:0.015751254670649566 - cluster/prob_snapshot/cluster_31:0.01918446527606162 - cluster/prob_snapshot/cluster_32:0.010758318878935573 - cluster/prob_snapshot/cluster_33:0.018441491442073382 - cluster/prob_snapshot/cluster_34:0.013472447782688515 - cluster/prob_snapshot/cluster_35:0.021244806015094396 - cluster/prob_snapshot/cluster_36:0.0161153118493038 - cluster/prob_snapshot/cluster_37:0.01706168289323809 - cluster/prob_snapshot/cluster_38:0.01918446527606162 - cluster/prob_snapshot/cluster_39:0.021207344531410873 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.01918446527606162 - 
cluster/prob_snapshot/cluster_42:0.012729473948700278 - cluster/prob_snapshot/cluster_43:0.014440696481792714 - cluster/prob_snapshot/cluster_44:0.016151919014885153 - cluster/prob_snapshot/cluster_45:0.015090527823473386 - cluster/prob_snapshot/cluster_46:0.019927439110049853 - cluster/prob_snapshot/cluster_47:0.013790865140112043 - cluster/prob_snapshot/cluster_48:0.019927439110049853 - cluster/prob_snapshot/cluster_49:0.020447520793841618 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.013316423277550983 - cluster/prob_snapshot/cluster_52:0.01706168289323809 - cluster/prob_snapshot/cluster_53:0.013286162798073386 - cluster/prob_snapshot/cluster_54:0.012274592009523808 - cluster/prob_snapshot/cluster_55:0.018441491442073382 - cluster/prob_snapshot/cluster_56:0.01843395556461436 - cluster/prob_snapshot/cluster_57:0.014629364181327336 - cluster/prob_snapshot/cluster_58:0.01691259707055658 - cluster/prob_snapshot/cluster_59:0.020083463615187383 - cluster/prob_snapshot/cluster_60:0.010758318878935573 - cluster/prob_snapshot/cluster_61:0.010758318878935573 - cluster/prob_snapshot/cluster_62:0.020811577972495852 - cluster/prob_snapshot/cluster_63:0.016151919014885153
(TaskRunner pid=2823680) Training Progress:  14%|█▍        | 112/800 [3:26:38<20:50:49, 109.08s/it]
[36m(TaskRunner pid=2823680)[0m step:112 - global_seqlen/min:354151 - global_seqlen/max:464792 - global_seqlen/minmax_diff:110641 - global_seqlen/balanced_min:392822 - global_seqlen/balanced_max:393005 - global_seqlen/mean:392912.0 - frontier/skipped_zero_acc_count:42.0 - actor/entropy:np.float64(0.23309168579100176) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010077234357595444 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05081461282679811) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002733656018007909) - actor/ppo_kl:np.float64(3.083127401010262e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2074328295209191) - perf/mfu/actor:np.float64(0.25284026178318997) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.56029319763184) - actor/lr:np.float64(1e-06) - training/global_step:112 - training/epoch:0 - critic/score/mean:0.5334302186965942 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5242785215377808 - critic/rewards/max:1.0037250518798828 - critic/rewards/min:-0.06439285725355148 - critic/advantages/mean:-0.147700235247612 - critic/advantages/max:2.474832534790039 - critic/advantages/min:-2.4748518466949463 - critic/returns/mean:-0.147700235247612 - critic/returns/max:2.474832534790039 - critic/returns/min:-2.4748518466949463 - response_length/mean:1130.024658203125 - response_length/max:8192.0 - response_length/min:255.0 - response_length/clip_ratio:0.004360465332865715 - response_length_non_aborted/mean:1130.024658203125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:255.0 - response_length_non_aborted/clip_ratio:0.004360465332865715 - response/aborted_ratio:0.0 - prompt_length/mean:239.46511840820312 - prompt_length/max:355.0 - prompt_length/min:184.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.401647210121155e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6805864684283733) - timing_s/agent_loop/generate_sequences/max:np.float64(28.35120055731386) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.216935580046993) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.35120055731386) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:263 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.961756790056825 - timing_s/reward:0.0011873282492160797 - timing_s/old_log_prob:7.9740955559536815 - timing_s/ref:20.841733411885798 - timing_s/adv:0.0721439030021429 - timing_s/update_actor:18.150128992274404 - timing_s/update_weights:26.843745606020093 - timing_s/step:104.26145396381617 - timing_s/stop_profile:6.378348916769028e-05 - timing_per_token_ms/adv:7.656889607522631e-05 - timing_per_token_ms/update_actor:0.019263378923651125 - timing_per_token_ms/gen:0.0385381529654461 - timing_per_token_ms/ref:0.02212007464573762 - perf/total_num_tokens:1571648 - perf/time_per_step:104.26145396381617 - perf/throughput:3768.5259994202625 - frontier/active_count:60.0 - frontier/completed_count:4.0 - frontier/blacklisted_count:646.0 - frontier/mean_score:2.3044164897745003 - frontier/mean_frontier_pct:0.26728758798741065 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.6264109999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:1.763 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:3.6769999999999996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.09 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.7321299999999997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:80.0 - frontier/cluster_7/score:2.9182939699999992 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:1.2401 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.201789999999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.49 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.0765509999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.7598999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:3.5717524750999994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:2.767705699999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.1598999999999995 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:1.343 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:2.9176456999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.1481519899999992 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.91 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.5540999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.6569999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.5340999999999998 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.1598999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.763 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.1815089999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.6569999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:2.5540999999999996 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:1.8659000000000001 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:2.9596463929999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.2319299999999993 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.9540999999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:2.7598999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.956014568369999 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.7598999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.5340999999999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:2.237 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.237 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.7598999999999996 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:2.8823509999999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:1.844291 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.3629999999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.0878699999999997 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.5530562999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.0261299999999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:2.3423519899999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:2.8470562999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:1.49 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:2.8823509999999994 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:2.237 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:112.0 - cluster/prob_snapshot/cluster_0:0.026227977278207956 - cluster/prob_snapshot/cluster_1:0.012750877917996782 - cluster/prob_snapshot/cluster_2:0.008969009475954515 - cluster/prob_snapshot/cluster_3:0.02659386165880554 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.015115901785940598 - cluster/prob_snapshot/cluster_6:0.012527610985876214 - 
cluster/prob_snapshot/cluster_7:0.021106528723934294 - cluster/prob_snapshot/cluster_8:0.019216722988155105 - cluster/prob_snapshot/cluster_9:0.008969009475954515 - cluster/prob_snapshot/cluster_10:0.030389399504855186 - cluster/prob_snapshot/cluster_11:0.010776408450263873 - cluster/prob_snapshot/cluster_12:0.015018632042821401 - cluster/prob_snapshot/cluster_13:0.019960946095223663 - cluster/prob_snapshot/cluster_14:0.02583266010397217 - cluster/prob_snapshot/cluster_15:0.020017400733774147 - cluster/prob_snapshot/cluster_16:0.015621452759546935 - cluster/prob_snapshot/cluster_17:0.009713232583023075 - cluster/prob_snapshot/cluster_18:0.021101840118359763 - cluster/prob_snapshot/cluster_19:0.02276897430050404 - cluster/prob_snapshot/cluster_20:0.013814053785237581 - cluster/prob_snapshot/cluster_21:0.015115901785940598 - cluster/prob_snapshot/cluster_22:0.009713232583023075 - cluster/prob_snapshot/cluster_23:0.018472499881086544 - cluster/prob_snapshot/cluster_24:0.019216722988155105 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.011095361210436111 - cluster/prob_snapshot/cluster_27:0.015621452759546935 - cluster/prob_snapshot/cluster_28:0.016634724453427452 - cluster/prob_snapshot/cluster_29:0.012750877917996782 - cluster/prob_snapshot/cluster_30:0.01577773961203133 - cluster/prob_snapshot/cluster_31:0.019216722988155105 - cluster/prob_snapshot/cluster_32:0.010776408450263873 - cluster/prob_snapshot/cluster_33:0.018472499881086544 - cluster/prob_snapshot/cluster_34:0.013495101025065343 - cluster/prob_snapshot/cluster_35:0.021405609663971936 - cluster/prob_snapshot/cluster_36:0.016142408934494923 - cluster/prob_snapshot/cluster_37:0.014133006545409818 - cluster/prob_snapshot/cluster_38:0.019960946095223663 - cluster/prob_snapshot/cluster_39:0.021379342532674882 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.019960946095223663 - cluster/prob_snapshot/cluster_42:0.011095361210436111 - 
cluster/prob_snapshot/cluster_43:0.01446497778558909 - cluster/prob_snapshot/cluster_44:0.016179077653181398 - cluster/prob_snapshot/cluster_45:0.015115901785940598 - cluster/prob_snapshot/cluster_46:0.019960946095223663 - cluster/prob_snapshot/cluster_47:0.016179077653181398 - cluster/prob_snapshot/cluster_48:0.019960946095223663 - cluster/prob_snapshot/cluster_49:0.020846571592635248 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.013338814172580944 - cluster/prob_snapshot/cluster_52:0.017090371253673506 - cluster/prob_snapshot/cluster_53:0.013308502811631241 - cluster/prob_snapshot/cluster_54:0.012295231117750726 - cluster/prob_snapshot/cluster_55:0.015100496584598945 - cluster/prob_snapshot/cluster_56:0.018464951332429137 - cluster/prob_snapshot/cluster_57:0.01465396272035781 - cluster/prob_snapshot/cluster_58:0.016941034750690198 - cluster/prob_snapshot/cluster_59:0.02059130306691073 - cluster/prob_snapshot/cluster_60:0.010776408450263873 - cluster/prob_snapshot/cluster_61:0.010776408450263873 - cluster/prob_snapshot/cluster_62:0.020846571592635248 - cluster/prob_snapshot/cluster_63:0.016179077653181398
[36m(TaskRunner pid=2823680)[0m Training Progress:  14%|█▍        | 113/800 [3:28:22<20:31:48, 107.58s/it]
[36m(TaskRunner pid=2823680)[0m step:113 - global_seqlen/min:347750 - global_seqlen/max:408442 - global_seqlen/minmax_diff:60692 - global_seqlen/balanced_min:370737 - global_seqlen/balanced_max:370875 - global_seqlen/mean:370813.5 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.2569087968416968) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010760213248431683 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.08271627131762216) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00030894896665312427) - actor/ppo_kl:np.float64(3.264409701609928e-05) - actor/pg_clipfrac_lower:np.float64(9.93581459030737e-07) - actor/grad_norm:np.float64(0.2911039888858795) - perf/mfu/actor:np.float64(0.20845250510414823) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.08338928222656) - actor/lr:np.float64(1e-06) - training/global_step:113 - training/epoch:0 - critic/score/mean:0.5306122303009033 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5218949317932129 - critic/rewards/max:1.0035761594772339 - critic/rewards/min:-0.04479747265577316 - critic/advantages/mean:-0.11585846543312073 - critic/advantages/max:2.474832057952881 - critic/advantages/min:-2.474846124649048 - critic/returns/mean:-0.11585846543312073 - critic/returns/max:2.474832057952881 - critic/returns/min:-2.474846124649048 - response_length/mean:1100.8826904296875 - response_length/max:8192.0 - response_length/min:86.0 - response_length/clip_ratio:0.0076530613005161285 - response_length_non_aborted/mean:1100.8826904296875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:86.0 - response_length_non_aborted/clip_ratio:0.0076530613005161285 - response/aborted_ratio:0.0 - prompt_length/mean:239.2857208251953 - prompt_length/max:505.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.417479693889618e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.7889937609434128) - timing_s/agent_loop/generate_sequences/max:np.float64(27.38321707211435) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.465655844940557) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.38321707211435) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:249 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.48175223916769 - timing_s/reward:0.00011377967894077301 - timing_s/old_log_prob:9.837532605975866 - timing_s/ref:17.625377509742975 - timing_s/adv:0.10095821134746075 - timing_s/update_actor:20.89100203383714 - timing_s/update_weights:25.562247341498733 - timing_s/step:103.88173741195351 - timing_s/stop_profile:5.118269473314285e-05 - timing_per_token_ms/adv:9.60873513336551e-05 - timing_per_token_ms/update_actor:0.019883088511035717 - timing_per_token_ms/gen:0.034158296264091996 - timing_per_token_ms/ref:0.01677501828294398 - perf/total_num_tokens:1483254 - perf/time_per_step:103.88173741195351 - perf/throughput:3569.573528882191 - frontier/active_count:60.0 - frontier/completed_count:4.0 - frontier/blacklisted_count:676.0 - frontier/mean_score:2.299694330434 - frontier/mean_frontier_pct:0.28374197417605085 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.4384876999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:1.763 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:3.6769999999999996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.09 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.7321299999999997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:80.0 - frontier/cluster_7/score:2.9182939699999992 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:1.7680699999999998 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.441253 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.49 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:1.7535856999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.8319299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:3.4002267325699993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:2.767705699999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:1.8119299999999996 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:1.343 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:2.9176456999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.1481519899999992 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.91 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.5540999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.6569999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.9738699999999998 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.1598999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:2.51 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.763 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.1815089999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.6569999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.343 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:2.5540999999999996 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.20613 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:2.9717524750999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.2319299999999993 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.9540999999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.2319299999999993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.956014568369999 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.7598999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.5340999999999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:2.237 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.237 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.7598999999999996 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:2.8823509999999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - 
frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:1.844291 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.3629999999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.0878699999999997 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.5530562999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:1.7182909999999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:2.3423519899999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:2.8470562999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.343 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:2.9176456999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:2.237 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:113.0 - cluster/prob_snapshot/cluster_0:0.024919889384828853 - cluster/prob_snapshot/cluster_1:0.012777060387755137 - cluster/prob_snapshot/cluster_2:0.008987426311318858 - cluster/prob_snapshot/cluster_3:0.026648469112748516 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.015146940561774382 - 
cluster/prob_snapshot/cluster_6:0.01255333500251974 - cluster/prob_snapshot/cluster_7:0.021149868567164873 - cluster/prob_snapshot/cluster_8:0.01925618233140408 - cluster/prob_snapshot/cluster_9:0.012813804401462407 - cluster/prob_snapshot/cluster_10:0.032187270435790505 - cluster/prob_snapshot/cluster_11:0.010798536572748244 - cluster/prob_snapshot/cluster_12:0.01270883175496532 - cluster/prob_snapshot/cluster_13:0.020523959514404652 - cluster/prob_snapshot/cluster_14:0.0246425991458345 - cluster/prob_snapshot/cluster_15:0.020058504177217298 - cluster/prob_snapshot/cluster_16:0.013131672733060217 - cluster/prob_snapshot/cluster_17:0.009733177595436841 - cluster/prob_snapshot/cluster_18:0.021145170334074932 - cluster/prob_snapshot/cluster_19:0.022815727785627623 - cluster/prob_snapshot/cluster_20:0.013842419365066541 - cluster/prob_snapshot/cluster_21:0.015146940561774382 - cluster/prob_snapshot/cluster_22:0.009733177595436841 - cluster/prob_snapshot/cluster_23:0.0185104310472861 - cluster/prob_snapshot/cluster_24:0.01925618233140408 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.014305306969698372 - cluster/prob_snapshot/cluster_27:0.015653529626495924 - cluster/prob_snapshot/cluster_28:0.01819082335409268 - cluster/prob_snapshot/cluster_29:0.012777060387755137 - cluster/prob_snapshot/cluster_30:0.0158101373961607 - cluster/prob_snapshot/cluster_31:0.01925618233140408 - cluster/prob_snapshot/cluster_32:0.009733177595436841 - cluster/prob_snapshot/cluster_33:0.0185104310472861 - cluster/prob_snapshot/cluster_34:0.015988574153850393 - cluster/prob_snapshot/cluster_35:0.021537300528538562 - cluster/prob_snapshot/cluster_36:0.016175555525378513 - cluster/prob_snapshot/cluster_37:0.014162027058259962 - cluster/prob_snapshot/cluster_38:0.016175555525378513 - cluster/prob_snapshot/cluster_39:0.02142324256786581 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.020001933615522063 - 
cluster/prob_snapshot/cluster_42:0.011118144265941664 - cluster/prob_snapshot/cluster_43:0.014494679963420463 - cluster/prob_snapshot/cluster_44:0.01621229953908579 - cluster/prob_snapshot/cluster_45:0.015146940561774382 - cluster/prob_snapshot/cluster_46:0.020001933615522063 - cluster/prob_snapshot/cluster_47:0.01621229953908579 - cluster/prob_snapshot/cluster_48:0.020001933615522063 - cluster/prob_snapshot/cluster_49:0.02088937764362246 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.013366203902208343 - cluster/prob_snapshot/cluster_52:0.017125464376781273 - cluster/prob_snapshot/cluster_53:0.013335830300344995 - cluster/prob_snapshot/cluster_54:0.012320477968907393 - cluster/prob_snapshot/cluster_55:0.015131503727613339 - cluster/prob_snapshot/cluster_56:0.01850286699854719 - cluster/prob_snapshot/cluster_57:0.012453039064512854 - cluster/prob_snapshot/cluster_58:0.01697582122836552 - cluster/prob_snapshot/cluster_59:0.020633584953169994 - cluster/prob_snapshot/cluster_60:0.010798536572748244 - cluster/prob_snapshot/cluster_61:0.009733177595436841 - cluster/prob_snapshot/cluster_62:0.021145170334074932 - cluster/prob_snapshot/cluster_63:0.01621229953908579
[36m(TaskRunner pid=2823680)[0m Training Progress:  14%|█▍        | 114/800 [3:30:05<20:14:09, 106.19s/it]
[36m(TaskRunner pid=2823680)[0m step:114 - global_seqlen/min:343903 - global_seqlen/max:375205 - global_seqlen/minmax_diff:31302 - global_seqlen/balanced_min:360038 - global_seqlen/balanced_max:360454 - global_seqlen/mean:360311.75 - frontier/skipped_zero_acc_count:29.0 - actor/entropy:np.float64(0.2343478288501501) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009863105602562428 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.023451763001503423) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005668483255431056) - actor/ppo_kl:np.float64(5.7208100805041796e-05) - actor/pg_clipfrac_lower:np.float64(8.094544318737463e-07) - actor/grad_norm:np.float64(0.22350168457398048) - perf/mfu/actor:np.float64(0.2031909623378027) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.22851181030273) - actor/lr:np.float64(1e-06) - training/global_step:114 - training/epoch:0 - critic/score/mean:0.5808081030845642 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5723754167556763 - critic/rewards/max:1.002878189086914 - critic/rewards/min:-0.03039422631263733 - critic/advantages/mean:-0.08162937313318253 - critic/advantages/max:2.474836587905884 - critic/advantages/min:-2.4748573303222656 - critic/returns/mean:-0.08162937313318253 - critic/returns/max:2.474836587905884 - critic/returns/min:-2.4748573303222656 - response_length/mean:1063.6199951171875 - response_length/max:8192.0 - response_length/min:241.0 - response_length/clip_ratio:0.0037878789007663727 - response_length_non_aborted/mean:1063.6199951171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:241.0 - response_length_non_aborted/clip_ratio:0.0037878789007663727 - response/aborted_ratio:0.0 - prompt_length/mean:234.6666717529297 - prompt_length/max:357.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00013348553329706192 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.8065721979364753) - timing_s/agent_loop/generate_sequences/max:np.float64(27.01962192542851) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.614748104727369) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.01962192542851) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:208 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:28.77747254818678 - timing_s/reward:0.00013207271695137024 - timing_s/old_log_prob:9.389922261238098 - timing_s/ref:18.649426186457276 - timing_s/adv:0.08020320348441601 - timing_s/update_actor:20.610516913235188 - timing_s/update_weights:24.83180832862854 - timing_s/step:102.73817046359181 - timing_s/stop_profile:5.730707198381424e-05 - timing_per_token_ms/adv:7.800024263176701e-05 - timing_per_token_ms/update_actor:0.020044402843720003 - timing_per_token_ms/gen:0.03416181938727304 - timing_per_token_ms/ref:0.018137177871823368 - perf/total_num_tokens:1441247 - perf/time_per_step:102.73817046359181 - perf/throughput:3507.087466850373 - frontier/active_count:60.0 - frontier/completed_count:4.0 - frontier/blacklisted_count:705.0 - frontier/mean_score:2.31357337934215 - frontier/mean_frontier_pct:0.30620536240691776 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.4384876999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:1.763 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:3.6769999999999996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.09 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:2.1124909999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:2.342805778999999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.7598999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:1.7680699999999998 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.441253 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.49 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:1.7535856999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:3.4823509999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:3.4002267325699993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:2.767705699999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:1.8119299999999996 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:1.343 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:2.9423519899999997 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.103706392999999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.91 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.6878699999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:3.3598999999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.9738699999999998 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.1598999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:2.51 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.763 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.1815089999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.6569999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.343 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.6878699999999993 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.20613 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:2.9717524750999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.8623509999999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.9540999999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.2319299999999993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:2.9692101978589993 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.8319299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.5340999999999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:2.237 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.237 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.7598999999999996 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:2.9176456999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - 
frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:1.844291 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.5540999999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.0878699999999997 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.5530562999999997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:1.5028036999999999 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.9396463929999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:2.8470562999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.343 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:2.9176456999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:2.237 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:114.0 - cluster/prob_snapshot/cluster_0:0.02477039580634721 - cluster/prob_snapshot/cluster_1:0.012700411232121065 - cluster/prob_snapshot/cluster_2:0.008933511043081868 - cluster/prob_snapshot/cluster_3:0.026488605842603038 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.015056074574664225 - 
cluster/prob_snapshot/cluster_6:0.015218096667132534 - cluster/prob_snapshot/cluster_7:0.01687725288161641 - cluster/prob_snapshot/cluster_8:0.019881942688332915 - cluster/prob_snapshot/cluster_9:0.012736934819725633 - cluster/prob_snapshot/cluster_10:0.031994180082751775 - cluster/prob_snapshot/cluster_11:0.010733756514952007 - cluster/prob_snapshot/cluster_12:0.012632591900605148 - cluster/prob_snapshot/cluster_13:0.025086381029261497 - cluster/prob_snapshot/cluster_14:0.02449476902217262 - cluster/prob_snapshot/cluster_15:0.019938173885130736 - cluster/prob_snapshot/cluster_16:0.013052896269890596 - cluster/prob_snapshot/cluster_17:0.009674788590322513 - cluster/prob_snapshot/cluster_18:0.021196301907345304 - cluster/prob_snapshot/cluster_19:0.022358676990846937 - cluster/prob_snapshot/cluster_20:0.013759379156750559 - cluster/prob_snapshot/cluster_21:0.015056074574664225 - cluster/prob_snapshot/cluster_22:0.009674788590322513 - cluster/prob_snapshot/cluster_23:0.019363048405264458 - cluster/prob_snapshot/cluster_24:0.02420426074804513 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.014219489914206924 - cluster/prob_snapshot/cluster_27:0.015559624628620695 - cluster/prob_snapshot/cluster_28:0.018081697216462776 - cluster/prob_snapshot/cluster_29:0.012700411232121065 - cluster/prob_snapshot/cluster_30:0.015715292913541228 - cluster/prob_snapshot/cluster_31:0.019140665141092267 - cluster/prob_snapshot/cluster_32:0.009674788590322513 - cluster/prob_snapshot/cluster_33:0.019363048405264458 - cluster/prob_snapshot/cluster_34:0.015892659235121523 - cluster/prob_snapshot/cluster_35:0.02140809898686535 - cluster/prob_snapshot/cluster_36:0.01341612226803851 - cluster/prob_snapshot/cluster_37:0.014077069534139406 - cluster/prob_snapshot/cluster_38:0.016078518911689144 - cluster/prob_snapshot/cluster_39:0.021389784768812734 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.020400836971401364 - 
cluster/prob_snapshot/cluster_42:0.011051446892340853 - cluster/prob_snapshot/cluster_43:0.014407726865707392 - cluster/prob_snapshot/cluster_44:0.01611504249929372 - cluster/prob_snapshot/cluster_45:0.015056074574664225 - cluster/prob_snapshot/cluster_46:0.019881942688332915 - cluster/prob_snapshot/cluster_47:0.01611504249929372 - cluster/prob_snapshot/cluster_48:0.019881942688332915 - cluster/prob_snapshot/cluster_49:0.02101832116825282 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.013286020494441176 - cluster/prob_snapshot/cluster_52:0.018399387593851624 - cluster/prob_snapshot/cluster_53:0.013255829102794086 - cluster/prob_snapshot/cluster_54:0.012246567835851284 - cluster/prob_snapshot/cluster_55:0.015040730345552245 - cluster/prob_snapshot/cluster_56:0.018391868921586754 - cluster/prob_snapshot/cluster_57:0.010825992621187235 - cluster/prob_snapshot/cluster_58:0.013972947723199267 - cluster/prob_snapshot/cluster_59:0.020509804770845737 - cluster/prob_snapshot/cluster_60:0.010733756514952007 - cluster/prob_snapshot/cluster_61:0.009674788590322513 - cluster/prob_snapshot/cluster_62:0.02101832116825282 - cluster/prob_snapshot/cluster_63:0.01611504249929372
[36m(TaskRunner pid=2823680)[0m Training Progress:  14%|█▍        | 115/800 [3:31:53<20:19:39, 106.83s/it]
[36m(TaskRunner pid=2823680)[0m step:115 - global_seqlen/min:325912 - global_seqlen/max:433438 - global_seqlen/minmax_diff:107526 - global_seqlen/balanced_min:385686 - global_seqlen/balanced_max:385804 - global_seqlen/mean:385756.0 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.2375531311845407) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010676153004169464 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.10762998402060475) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00037394848709482176) - actor/ppo_kl:np.float64(5.744713817712466e-05) - actor/pg_clipfrac_lower:np.float64(1.0039192754144703e-06) - actor/grad_norm:np.float64(0.2343012789885203) - perf/mfu/actor:np.float64(0.21751107273636833) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.6783332824707) - actor/lr:np.float64(1e-06) - training/global_step:115 - training/epoch:0 - critic/score/mean:0.574999988079071 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5660274028778076 - critic/rewards/max:1.0012911558151245 - critic/rewards/min:-0.054406631737947464 - critic/advantages/mean:-0.14175589382648468 - critic/advantages/max:2.4748566150665283 - critic/advantages/min:-2.4748477935791016 - critic/returns/mean:-0.14175589382648468 - critic/returns/max:2.4748566150665283 - critic/returns/min:-2.4748477935791016 - response_length/mean:1116.2381591796875 - response_length/max:8192.0 - response_length/min:129.0 - response_length/clip_ratio:0.009210526011884212 - response_length_non_aborted/mean:1116.2381591796875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:129.0 - response_length_non_aborted/clip_ratio:0.009210526011884212 - response/aborted_ratio:0.0 - prompt_length/mean:243.0631561279297 - prompt_length/max:555.0 - prompt_length/min:184.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010614749044179916 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0174550889059901) - timing_s/agent_loop/generate_sequences/max:np.float64(28.61649683956057) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.8304590484194705) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.61649683956057) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:193 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.607415916398168 - timing_s/reward:0.00028676073998212814 - timing_s/old_log_prob:10.186566282063723 - timing_s/ref:19.60121981985867 - timing_s/adv:0.07602029852569103 - timing_s/update_actor:20.951710725203156 - timing_s/update_weights:26.287007994949818 - timing_s/step:108.09146486315876 - timing_s/stop_profile:6.66789710521698e-05 - timing_per_token_ms/adv:7.358685482353166e-05 - timing_per_token_ms/update_actor:0.020281037109044175 - timing_per_token_ms/gen:0.03607914260468157 - timing_per_token_ms/ref:0.01897377602063238 - perf/total_num_tokens:1543024 - perf/time_per_step:108.09146486315876 - perf/throughput:3568.7924156487097 - frontier/active_count:58.0 - frontier/completed_count:6.0 - frontier/blacklisted_count:738.0 - frontier/mean_score:2.3829669690374358 - frontier/mean_frontier_pct:0.30742401814869896 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.4384876999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.5340999999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:3.6769999999999996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.09 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:2.1124909999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:2.342805778999999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.7598999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.1376489999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.441253 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.49 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.7275099899999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:3.3376456999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:3.4002267325699993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:2.767705699999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:1.8119299999999996 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:1.343 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:2.9596463929999994 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.103706392999999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:1.91 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.6878699999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.8519299999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.9738699999999998 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.1598999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:2.6569999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.763 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.1815089999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:2.1598999999999995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.343 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - 
frontier/cluster_33/score:2.7815089999999993 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.20613 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:2.9717524750999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.8623509999999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.2678699999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.2319299999999993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:2.9784471385012994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.8319299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.5340999999999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:2.237 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.8319299999999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.4659 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.7598999999999996 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:2.9176456999999996 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:1.844291 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.5540999999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.0878699999999997 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.6871394099999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:1.5028036999999999 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.9396463929999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:2.8470562999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:2.9176456999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:2.237 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:115.0 - cluster/prob_snapshot/cluster_0:0.02487834345165186 - cluster/prob_snapshot/cluster_1:0.011099608321757009 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02660404132657618 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.015121687890275829 - 
cluster/prob_snapshot/cluster_6:0.015284416063644338 - cluster/prob_snapshot/cluster_7:0.01695080276438876 - cluster/prob_snapshot/cluster_8:0.019968586798264236 - cluster/prob_snapshot/cluster_9:0.015466440668402024 - cluster/prob_snapshot/cluster_10:0.032133608472608224 - cluster/prob_snapshot/cluster_11:0.010780533472014825 - cluster/prob_snapshot/cluster_12:0.019734236739899207 - cluster/prob_snapshot/cluster_13:0.02414872562857473 - cluster/prob_snapshot/cluster_14:0.024601515505309045 - cluster/prob_snapshot/cluster_15:0.0200250630466686 - cluster/prob_snapshot/cluster_16:0.013109779875132763 - cluster/prob_snapshot/cluster_17:0.009716950639540879 - cluster/prob_snapshot/cluster_18:0.02141380335910365 - cluster/prob_snapshot/cluster_19:0.022456114534928115 - cluster/prob_snapshot/cluster_20:0.013819341564797527 - cluster/prob_snapshot/cluster_21:0.015121687890275829 - cluster/prob_snapshot/cluster_22:0.009716950639540879 - cluster/prob_snapshot/cluster_23:0.019447431210352 - cluster/prob_snapshot/cluster_24:0.027869704897220175 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.01428145745262141 - cluster/prob_snapshot/cluster_27:0.015627432380003234 - cluster/prob_snapshot/cluster_28:0.019224078815532473 - cluster/prob_snapshot/cluster_29:0.012755758732323581 - cluster/prob_snapshot/cluster_30:0.0157837790563769 - cluster/prob_snapshot/cluster_31:0.015627432380003234 - cluster/prob_snapshot/cluster_32:0.009716950639540879 - cluster/prob_snapshot/cluster_33:0.020124933474637906 - cluster/prob_snapshot/cluster_34:0.015961918327930247 - cluster/prob_snapshot/cluster_35:0.02150139397876406 - cluster/prob_snapshot/cluster_36:0.013474588786671326 - cluster/prob_snapshot/cluster_37:0.0164086231175693 - cluster/prob_snapshot/cluster_38:0.016148587967915465 - cluster/prob_snapshot/cluster_39:0.021549831591436266 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.020489742386176468 - 
cluster/prob_snapshot/cluster_42:0.011099608321757009 - cluster/prob_snapshot/cluster_43:0.014470514727536679 - cluster/prob_snapshot/cluster_44:0.016185270722749775 - cluster/prob_snapshot/cluster_45:0.015121687890275829 - cluster/prob_snapshot/cluster_46:0.020489742386176468 - cluster/prob_snapshot/cluster_47:0.017841421133316347 - cluster/prob_snapshot/cluster_48:0.019968586798264236 - cluster/prob_snapshot/cluster_49:0.021109917535792028 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.013343920038681673 - cluster/prob_snapshot/cluster_52:0.01847957083280071 - cluster/prob_snapshot/cluster_53:0.01331359707507012 - cluster/prob_snapshot/cluster_54:0.012299937518406176 - cluster/prob_snapshot/cluster_55:0.015106276792091 - cluster/prob_snapshot/cluster_56:0.019442145203674606 - cluster/prob_snapshot/cluster_57:0.010873171536723306 - cluster/prob_snapshot/cluster_58:0.014033840848059946 - cluster/prob_snapshot/cluster_59:0.020599185059638037 - cluster/prob_snapshot/cluster_60:0.010780533472014825 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.021109917535792028 - cluster/prob_snapshot/cluster_63:0.016185270722749775
[36m(TaskRunner pid=2823680)[0m Training Progress:  14%|█▍        | 116/800 [3:33:51<20:55:30, 110.13s/it]
[36m(TaskRunner pid=2823680)[0m step:116 - global_seqlen/min:364203 - global_seqlen/max:448264 - global_seqlen/minmax_diff:84061 - global_seqlen/balanced_min:400294 - global_seqlen/balanced_max:400386 - global_seqlen/mean:400333.5 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.25509321751693886) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009656279347836971 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.019459561968687922) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005014269537089907) - actor/ppo_kl:np.float64(8.084807325599759e-05) - actor/pg_clipfrac_lower:np.float64(4.543973015339968e-06) - actor/grad_norm:np.float64(0.2172674909234047) - perf/mfu/actor:np.float64(0.1996134868968332) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.76574325561523) - actor/lr:np.float64(1e-06) - training/global_step:116 - training/epoch:0 - critic/score/mean:0.4700520932674408 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4603828191757202 - critic/rewards/max:1.0127776861190796 - critic/rewards/min:-0.05139251798391342 - critic/advantages/mean:-0.1540638953447342 - critic/advantages/max:2.474855899810791 - critic/advantages/min:-2.474851131439209 - critic/returns/mean:-0.1540638953447342 - critic/returns/max:2.474855899810791 - critic/returns/min:-2.474851131439209 - response_length/mean:1247.1861572265625 - response_length/max:8192.0 - response_length/min:209.0 - response_length/clip_ratio:0.0078125 - response_length_non_aborted/mean:1247.1861572265625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:209.0 - response_length_non_aborted/clip_ratio:0.0078125 - response/aborted_ratio:0.0 - prompt_length/mean:234.3333282470703 - prompt_length/max:340.0 - prompt_length/min:182.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.57161357998848e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.9205822022631764) - timing_s/agent_loop/generate_sequences/max:np.float64(28.819741709157825) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.690866437235854) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.819741709157825) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:213 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.827555818483233 - timing_s/reward:0.00013499148190021515 - timing_s/old_log_prob:10.89451054111123 - timing_s/ref:23.190736931748688 - timing_s/adv:0.09309776965528727 - timing_s/update_actor:23.43996067624539 - timing_s/update_weights:28.77732507698238 - timing_s/step:117.62775212433189 - timing_s/stop_profile:5.963817238807678e-05 - timing_per_token_ms/adv:8.18221101252561e-05 - timing_per_token_ms/update_actor:0.020600998830421498 - timing_per_token_ms/gen:0.03218448592976819 - timing_per_token_ms/ref:0.020381960149435437 - perf/total_num_tokens:1601334 - perf/time_per_step:117.62775212433189 - perf/throughput:3403.393270466052 - frontier/active_count:58.0 - frontier/completed_count:6.0 - frontier/blacklisted_count:770.0 - frontier/mean_score:2.3742066912788156 - frontier/mean_frontier_pct:0.32720015656654444 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:8.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.4384876999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.5340999999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:3.4738999999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:2.3629999999999995 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:2.1124909999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:2.342805778999999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.7598999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.1376489999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.441253 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.49 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.7275099899999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:3.3376456999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:3.4002267325699993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:2.837393989999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:1.8119299999999996 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:2.9596463929999994 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.103706392999999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:2.8369999999999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.6878699999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.8519299999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.9738699999999998 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.1598999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.7598999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.763 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.8270562999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.8119299999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.2401 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - 
frontier/cluster_33/score:2.2470562999999992 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.4442909999999998 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:2.9717524750999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.8623509999999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.2678699999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.2319299999999993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:2.9784471385012994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.8319299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.5340999999999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.3 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:2.4659 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.8319299999999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.4659 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:2.2319299999999993 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:2.9176456999999996 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:1.844291 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.5540999999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.0878699999999997 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.6871394099999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:1.5028036999999999 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.9396463929999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:2.8470562999999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:2.9176456999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:32.0 - frontier/cluster_63/score:1.8659000000000001 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:116.0 - cluster/prob_snapshot/cluster_0:0.02497013882886623 - cluster/prob_snapshot/cluster_1:0.011140563328862185 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.025227301315516815 - cluster/prob_snapshot/cluster_4:0.0 - 
cluster/prob_snapshot/cluster_5:0.017159996835995922 - cluster/prob_snapshot/cluster_6:0.015340812050812466 - cluster/prob_snapshot/cluster_7:0.017013347336010558 - cluster/prob_snapshot/cluster_8:0.0200422663003238 - cluster/prob_snapshot/cluster_9:0.015523508284583092 - cluster/prob_snapshot/cluster_10:0.03225217411250842 - cluster/prob_snapshot/cluster_11:0.010820311166159089 - cluster/prob_snapshot/cluster_12:0.019807051544031853 - cluster/prob_snapshot/cluster_13:0.024237828883485142 - cluster/prob_snapshot/cluster_14:0.024692289450939462 - cluster/prob_snapshot/cluster_15:0.02060502407569777 - cluster/prob_snapshot/cluster_16:0.0131581519538917 - cluster/prob_snapshot/cluster_17:0.009005548910841535 - cluster/prob_snapshot/cluster_18:0.021492815378564 - cluster/prob_snapshot/cluster_19:0.02253897244339412 - cluster/prob_snapshot/cluster_20:0.020602162938519016 - cluster/prob_snapshot/cluster_21:0.015177483447833888 - cluster/prob_snapshot/cluster_22:0.009005548910841535 - cluster/prob_snapshot/cluster_23:0.01951918776790874 - cluster/prob_snapshot/cluster_24:0.02797253771158602 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.01433415275271573 - cluster/prob_snapshot/cluster_27:0.015685094018649 - cluster/prob_snapshot/cluster_28:0.0200422663003238 - cluster/prob_snapshot/cluster_29:0.012802824554321123 - cluster/prob_snapshot/cluster_30:0.013267998445698863 - cluster/prob_snapshot/cluster_31:0.0131581519538917 - cluster/prob_snapshot/cluster_32:0.009005548910841535 - cluster/prob_snapshot/cluster_33:0.01631801904287122 - cluster/prob_snapshot/cluster_34:0.017750328322578633 - cluster/prob_snapshot/cluster_35:0.021580729187507 - cluster/prob_snapshot/cluster_36:0.013524306926582241 - cluster/prob_snapshot/cluster_37:0.01646916717073638 - cluster/prob_snapshot/cluster_38:0.01620817255106406 - cluster/prob_snapshot/cluster_39:0.02162934552385248 - cluster/prob_snapshot/cluster_40:0.0 - 
cluster/prob_snapshot/cluster_41:0.02056534483273886 - cluster/prob_snapshot/cluster_42:0.011140563328862185 - cluster/prob_snapshot/cluster_43:0.016702493746420067 - cluster/prob_snapshot/cluster_44:0.017907251882303153 - cluster/prob_snapshot/cluster_45:0.015177483447833888 - cluster/prob_snapshot/cluster_46:0.02056534483273886 - cluster/prob_snapshot/cluster_47:0.017907251882303153 - cluster/prob_snapshot/cluster_48:0.01620817255106406 - cluster/prob_snapshot/cluster_49:0.02118780828631278 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.013393156040903832 - cluster/prob_snapshot/cluster_52:0.018547756207709345 - cluster/prob_snapshot/cluster_53:0.013362721192516334 - cluster/prob_snapshot/cluster_54:0.01234532146474527 - cluster/prob_snapshot/cluster_55:0.015162015486233941 - cluster/prob_snapshot/cluster_56:0.019513882257079958 - cluster/prob_snapshot/cluster_57:0.010913291044063887 - cluster/prob_snapshot/cluster_58:0.014085622499716843 - cluster/prob_snapshot/cluster_59:0.02067519132454602 - cluster/prob_snapshot/cluster_60:0.010820311166159089 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.02118780828631278 - cluster/prob_snapshot/cluster_63:0.013550079600628352
[36m(TaskRunner pid=2823680)[0m Training Progress:  15%|█▍        | 117/800 [3:35:40<20:49:40, 109.78s/it]
[36m(TaskRunner pid=2823680)[0m step:117 - global_seqlen/min:328667 - global_seqlen/max:415700 - global_seqlen/minmax_diff:87033 - global_seqlen/balanced_min:362170 - global_seqlen/balanced_max:362243 - global_seqlen/mean:362219.5 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.2725574728877594) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011196905747056007 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.060277602868154645) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00033609431481333257) - actor/ppo_kl:np.float64(8.013509487684682e-05) - actor/pg_clipfrac_lower:np.float64(1.8785016209221794e-07) - actor/grad_norm:np.float64(0.2370245928565661) - perf/mfu/actor:np.float64(0.20913344910879872) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.58528900146484) - actor/lr:np.float64(1e-06) - training/global_step:117 - training/epoch:0 - critic/score/mean:0.553947389125824 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5447049736976624 - critic/rewards/max:1.0035635232925415 - critic/rewards/min:-0.09639599919319153 - critic/advantages/mean:-0.17713662981987 - critic/advantages/max:2.474832773208618 - critic/advantages/min:-2.4748594760894775 - critic/returns/mean:-0.17713662981987 - critic/returns/max:2.474832773208618 - critic/returns/min:-2.4748594760894775 - response_length/mean:1126.902587890625 - response_length/max:8192.0 - response_length/min:201.0 - response_length/clip_ratio:0.010526316240429878 - response_length_non_aborted/mean:1126.902587890625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:201.0 - response_length_non_aborted/clip_ratio:0.010526316240429878 - response/aborted_ratio:0.0 - prompt_length/mean:243.29473876953125 - prompt_length/max:369.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.869264274835587e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(2.131372327916324) - timing_s/agent_loop/generate_sequences/max:np.float64(28.41796875372529) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.90799621372571) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.41796875372529) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:267 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.944081261754036 - timing_s/reward:0.00020416360348463058 - timing_s/old_log_prob:9.335125534795225 - timing_s/ref:20.597174896858633 - timing_s/adv:0.06841790489852428 - timing_s/update_actor:20.263348203152418 - timing_s/update_weights:28.127430487424135 - timing_s/step:108.73969912808388 - timing_s/stop_profile:6.140768527984619e-05 - timing_per_token_ms/adv:6.570116185578747e-05 - timing_per_token_ms/update_actor:0.019458729728863897 - timing_per_token_ms/gen:0.034963186542705596 - timing_per_token_ms/ref:0.01977930080843005 - perf/total_num_tokens:1448878 - perf/time_per_step:108.73969912808388 - perf/throughput:3331.069544098551 - frontier/active_count:58.0 - frontier/completed_count:6.0 - frontier/blacklisted_count:803.0 - frontier/mean_score:2.3869986919770914 - frontier/mean_frontier_pct:0.337407689490435 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.4384876999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.5340999999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.3317299999999994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:2.3629999999999995 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.7787436999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:2.539964045299999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.7598999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.1376489999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:96.0 - frontier/cluster_10/score:4.608877099999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.9429999999999998 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.7275099899999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:3.3376456999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:3.4002267325699993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:2.837393989999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - 
frontier/cluster_16/score:1.8119299999999996 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:2.9596463929999994 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.072594475099999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:2.8858999999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.6878699999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.8519299999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.9738699999999998 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.1598999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.7598999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:2.1340999999999997 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.8270562999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.8119299999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.2401 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.2470562999999992 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.4442909999999998 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:2.9717524750999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.8623509999999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.8875089999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.462350999999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:2.9784471385012994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:2.8823509999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.37387 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.3 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:2.4659 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.8319299999999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.62613 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:2.2319299999999993 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - 
frontier/cluster_49/score:2.9176456999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.1910036999999996 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.5540999999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.0878699999999997 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.6871394099999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:1.5028036999999999 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:1.6577524750999997 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:2.8929394099999994 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:2.9176456999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:32.0 - frontier/cluster_63/score:1.8659000000000001 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:117.0 - cluster/prob_snapshot/cluster_0:0.02483632307337022 - cluster/prob_snapshot/cluster_1:0.01108086070130693 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.024065208281314997 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0170680358758805 - cluster/prob_snapshot/cluster_6:0.012847931140751764 - cluster/prob_snapshot/cluster_7:0.018346253681179415 - cluster/prob_snapshot/cluster_8:0.019934859167940158 - cluster/prob_snapshot/cluster_9:0.015440317317833295 - cluster/prob_snapshot/cluster_10:0.03329008873902839 - cluster/prob_snapshot/cluster_11:0.01403436043454753 - cluster/prob_snapshot/cluster_12:0.0197009049348889 - cluster/prob_snapshot/cluster_13:0.024107937599906173 - cluster/prob_snapshot/cluster_14:0.024559962698955856 - cluster/prob_snapshot/cluster_15:0.020494601106782778 - cluster/prob_snapshot/cluster_16:0.01308763700574869 - cluster/prob_snapshot/cluster_17:0.008957287892373851 - cluster/prob_snapshot/cluster_18:0.021377634708270976 - cluster/prob_snapshot/cluster_19:0.022193462857824378 - cluster/prob_snapshot/cluster_20:0.02084496180034005 - cluster/prob_snapshot/cluster_21:0.015096146839014071 - cluster/prob_snapshot/cluster_22:0.008957287892373851 - cluster/prob_snapshot/cluster_23:0.01941458382975155 - cluster/prob_snapshot/cluster_24:0.027822632006508835 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.014257335579485504 - cluster/prob_snapshot/cluster_27:0.015601037108893056 - cluster/prob_snapshot/cluster_28:0.019934859167940158 - cluster/prob_snapshot/cluster_29:0.015414682760354032 - cluster/prob_snapshot/cluster_30:0.013196894826768297 - cluster/prob_snapshot/cluster_31:0.01308763700574869 - cluster/prob_snapshot/cluster_32:0.008957287892373851 - cluster/prob_snapshot/cluster_33:0.016230570268101267 - cluster/prob_snapshot/cluster_34:0.0176552037575505 - cluster/prob_snapshot/cluster_35:0.021465077384360334 - cluster/prob_snapshot/cluster_36:0.013451829742480713 - cluster/prob_snapshot/cluster_37:0.013633546901416558 - cluster/prob_snapshot/cluster_38:0.017785651801527812 - cluster/prob_snapshot/cluster_39:0.021513433184237752 - 
cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.020819327242860786 - cluster/prob_snapshot/cluster_42:0.009923513520438403 - cluster/prob_snapshot/cluster_43:0.016612984559680556 - cluster/prob_snapshot/cluster_44:0.01781128635900708 - cluster/prob_snapshot/cluster_45:0.015096146839014071 - cluster/prob_snapshot/cluster_46:0.02045513450612876 - cluster/prob_snapshot/cluster_47:0.018968633539875608 - cluster/prob_snapshot/cluster_48:0.01612131244708166 - cluster/prob_snapshot/cluster_49:0.021074262158573203 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.015825700277523027 - cluster/prob_snapshot/cluster_52:0.018448358201687002 - cluster/prob_snapshot/cluster_53:0.013291109951420953 - cluster/prob_snapshot/cluster_54:0.012279162500633456 - cluster/prob_snapshot/cluster_55:0.015080761770704453 - cluster/prob_snapshot/cluster_56:0.019409306751321354 - cluster/prob_snapshot/cluster_57:0.010854806375796005 - cluster/prob_snapshot/cluster_58:0.011974007075047184 - cluster/prob_snapshot/cluster_59:0.02089580771757451 - cluster/prob_snapshot/cluster_60:0.01076232477996697 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.021074262158573203 - cluster/prob_snapshot/cluster_63:0.01347746429995998
[36m(TaskRunner pid=2823680)[0m Training Progress:  15%|█▍        | 118/800 [3:37:22<20:19:58, 107.33s/it]
[36m(TaskRunner pid=2823680)[0m step:118 - global_seqlen/min:323094 - global_seqlen/max:352715 - global_seqlen/minmax_diff:29621 - global_seqlen/balanced_min:334619 - global_seqlen/balanced_max:334838 - global_seqlen/mean:334729.0 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.3218189001781866) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011888142675161362 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.010023809008998796) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00043373191601858707) - actor/ppo_kl:np.float64(5.270279869407091e-06) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2613809121151765) - perf/mfu/actor:np.float64(0.19689936392342994) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.0786247253418) - actor/lr:np.float64(1e-06) - training/global_step:118 - training/epoch:0 - critic/score/mean:0.5625 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5533496141433716 - critic/rewards/max:1.0194405317306519 - critic/rewards/min:-0.03642024099826813 - critic/advantages/mean:-0.09330195188522339 - critic/advantages/max:2.4748189449310303 - critic/advantages/min:-2.4748547077178955 - critic/returns/mean:-0.09330195188522339 - critic/returns/max:2.4748189449310303 - critic/returns/min:-2.4748547077178955 - response_length/mean:997.55078125 - response_length/max:8192.0 - response_length/min:141.0 - response_length/clip_ratio:0.00390625 - response_length_non_aborted/mean:997.55078125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:141.0 - response_length_non_aborted/clip_ratio:0.00390625 - response/aborted_ratio:0.0 - prompt_length/mean:233.84375 - prompt_length/max:497.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - 
num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.50902870297432e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.5831226091831923) - timing_s/agent_loop/generate_sequences/max:np.float64(27.18432815000415) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.009050946735442) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.18432815000415) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:252 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:28.489511877298355 - timing_s/reward:0.00012374017387628555 - timing_s/old_log_prob:8.422335583716631 - timing_s/ref:18.70205488987267 - timing_s/adv:0.08091269247233868 - timing_s/update_actor:19.805121043697 - timing_s/update_weights:25.535582792013884 - timing_s/step:101.40474317967892 - timing_s/stop_profile:5.788356065750122e-05 - timing_per_token_ms/adv:8.555752494402483e-05 - timing_per_token_ms/update_actor:0.0209420436514929 - timing_per_token_ms/gen:0.037186797191165286 - timing_per_token_ms/ref:0.019775655448517223 - perf/total_num_tokens:1338916 - perf/time_per_step:101.40474317967892 - perf/throughput:3300.9205437944274 - frontier/active_count:56.0 - frontier/completed_count:8.0 - frontier/blacklisted_count:835.0 - frontier/mean_score:2.4158484306289445 - frontier/mean_frontier_pct:0.33950787024470713 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:48.0 - 
frontier/cluster_0/score:3.4384876999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.5340999999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.3317299999999994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:2.3629999999999995 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.7787436999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:2.539964045299999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.8319299999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.1376489999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:96.0 - frontier/cluster_10/score:4.608877099999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.9429999999999998 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.7275099899999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:3.2363519899999993 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:3.4002267325699993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.8861757929999987 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:2.1683509999999995 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:2.9717524750999993 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.072594475099999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:2.8858999999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.6878699999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.8519299999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.9738699999999998 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.1598999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.7598999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.8270562999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.8119299999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.2401 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - 
frontier/cluster_33/score:2.2470562999999992 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.4442909999999998 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:2.9717524750999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.8623509999999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.8875089999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.023645699999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:112.0 - frontier/cluster_39/score:2.9849129969509094 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:2.8823509999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.37387 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.3 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:2.4659 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.8319299999999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.62613 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:2.2319299999999993 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:2.9423519899999997 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:1.8337025899999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.5540999999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.1880699999999997 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.0878699999999997 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.6871394099999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:1.9519625899999997 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:1.6577524750999997 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:2.9250575869999995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:2.9176456999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.60613 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:118.0 - cluster/prob_snapshot/cluster_0:0.025416149992258916 - cluster/prob_snapshot/cluster_1:0.011339553636653812 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.024627032812625387 - cluster/prob_snapshot/cluster_4:0.0 - 
cluster/prob_snapshot/cluster_5:0.017466504949750966 - cluster/prob_snapshot/cluster_6:0.013147877968848221 - cluster/prob_snapshot/cluster_7:0.01877456393119845 - cluster/prob_snapshot/cluster_8:0.020932678528289568 - cluster/prob_snapshot/cluster_9:0.01580078579743132 - cluster/prob_snapshot/cluster_10:0.034067276631377014 - cluster/prob_snapshot/cluster_11:0.014362005551149443 - cluster/prob_snapshot/cluster_12:0.020160840770558703 - cluster/prob_snapshot/cluster_13:0.023922030491947265 - cluster/prob_snapshot/cluster_14:0.025133337729458086 - cluster/prob_snapshot/cluster_15:0.021333645270539953 - cluster/prob_snapshot/cluster_16:0.01602772470346909 - cluster/prob_snapshot/cluster_17:0.009166404057632745 - cluster/prob_snapshot/cluster_18:0.021966199456525108 - cluster/prob_snapshot/cluster_19:0.022711589762129333 - cluster/prob_snapshot/cluster_20:0.021331606701009868 - cluster/prob_snapshot/cluster_21:0.015448580340659976 - cluster/prob_snapshot/cluster_22:0.009166404057632745 - cluster/prob_snapshot/cluster_23:0.019867835234569246 - cluster/prob_snapshot/cluster_24:0.02847217706775042 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.014590186256946655 - cluster/prob_snapshot/cluster_27:0.015965257740570084 - cluster/prob_snapshot/cluster_28:0.02040025688142941 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013504988534669355 - cluster/prob_snapshot/cluster_31:0.01339317998882872 - cluster/prob_snapshot/cluster_32:0.009166404057632745 - cluster/prob_snapshot/cluster_33:0.01660948793327088 - cluster/prob_snapshot/cluster_34:0.018067380808350294 - cluster/prob_snapshot/cluster_35:0.021966199456525108 - cluster/prob_snapshot/cluster_36:0.013765875141630832 - cluster/prob_snapshot/cluster_37:0.013951834655607065 - cluster/prob_snapshot/cluster_38:0.014958111568172767 - cluster/prob_snapshot/cluster_39:0.022063477628361854 - cluster/prob_snapshot/cluster_40:0.0 - 
cluster/prob_snapshot/cluster_41:0.021305373681091682 - cluster/prob_snapshot/cluster_42:0.010155187116087332 - cluster/prob_snapshot/cluster_43:0.01700083003996074 - cluster/prob_snapshot/cluster_44:0.018227107302408344 - cluster/prob_snapshot/cluster_45:0.017466504949750966 - cluster/prob_snapshot/cluster_46:0.020932678528289568 - cluster/prob_snapshot/cluster_47:0.019411473822974826 - cluster/prob_snapshot/cluster_48:0.016497679387430245 - cluster/prob_snapshot/cluster_49:0.0217488809129262 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.013554115685402525 - cluster/prob_snapshot/cluster_52:0.01887905217611466 - cluster/prob_snapshot/cluster_53:0.01617348095023343 - cluster/prob_snapshot/cluster_54:0.012565830899101417 - cluster/prob_snapshot/cluster_55:0.015432836093709924 - cluster/prob_snapshot/cluster_56:0.01986243495786538 - cluster/prob_snapshot/cluster_57:0.014428254016065898 - cluster/prob_snapshot/cluster_58:0.012253551338043194 - cluster/prob_snapshot/cluster_59:0.021621046475515073 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.021566260288053165 - cluster/prob_snapshot/cluster_63:0.011871975283513976
[36m(TaskRunner pid=2823680)[0m Training Progress:  15%|█▍        | 119/800 [3:39:10<20:20:36, 107.54s/it]
[36m(TaskRunner pid=2823680)[0m step:119 - global_seqlen/min:298058 - global_seqlen/max:419125 - global_seqlen/minmax_diff:121067 - global_seqlen/balanced_min:355865 - global_seqlen/balanced_max:355965 - global_seqlen/mean:355918.5 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.21816769936898103) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010977182537317276 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.013942805613623932) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004986235689633153) - actor/ppo_kl:np.float64(0.00011175354002072406) - actor/pg_clipfrac_lower:np.float64(2.2243576343801882e-07) - actor/grad_norm:np.float64(0.24983865891893706) - perf/mfu/actor:np.float64(0.19892968436850353) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.15531158447266) - actor/lr:np.float64(1e-06) - training/global_step:119 - training/epoch:0 - critic/score/mean:0.6223683953285217 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6132643818855286 - critic/rewards/max:1.003738284111023 - critic/rewards/min:-0.03744957596063614 - critic/advantages/mean:-0.1561727523803711 - critic/advantages/max:2.4748332500457764 - critic/advantages/min:-2.4748454093933105 - critic/returns/mean:-0.1561727523803711 - critic/returns/max:2.4748332500457764 - critic/returns/min:-2.4748454093933105 - response_length/mean:1109.071044921875 - response_length/max:8192.0 - response_length/min:186.0 - response_length/clip_ratio:0.014473684132099152 - response_length_non_aborted/mean:1109.071044921875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:186.0 - response_length_non_aborted/clip_ratio:0.014473684132099152 - response/aborted_ratio:0.0 - prompt_length/mean:232.6526336669922 - prompt_length/max:478.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.622704237699509e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.469002670608461) - timing_s/agent_loop/generate_sequences/max:np.float64(28.03120596986264) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.353922699456234) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.03120596986264) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:216 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.153465966694057 - timing_s/reward:0.0002843048423528671 - timing_s/old_log_prob:10.534996674396098 - timing_s/ref:19.264127975329757 - timing_s/adv:0.09176280535757542 - timing_s/update_actor:20.936183378100395 - timing_s/update_weights:26.451909886673093 - timing_s/step:107.82957664038986 - timing_s/stop_profile:6.622914224863052e-05 - timing_per_token_ms/adv:8.99891198061953e-05 - timing_per_token_ms/update_actor:0.02053150736787949 - timing_per_token_ms/gen:0.03577373426159643 - timing_per_token_ms/ref:0.018891771165654703 - perf/total_num_tokens:1423674 - perf/time_per_step:107.82957664038986 - perf/throughput:3300.7502309592037 - frontier/active_count:56.0 - frontier/completed_count:8.0 - frontier/blacklisted_count:867.0 - frontier/mean_score:2.4216648697646406 - frontier/mean_frontier_pct:0.3490032887442993 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.4384876999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.5340999999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.2322109999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.9540999999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.5451205899999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:2.539964045299999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.8319299999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:1.7963542999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:112.0 - frontier/cluster_10/score:4.726213969999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.9429999999999998 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.7275099899999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:3.2363519899999993 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.280158712798999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.8861757929999987 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - 
frontier/cluster_16/score:2.1683509999999995 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:2.9717524750999993 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.072594475099999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:2.8858999999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.7815089999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.8519299999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.9738699999999998 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.1598999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.7598999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.8270562999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:1.8119299999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.2401 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.2470562999999992 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.4442909999999998 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:2.9717524750999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:2.203645699999999 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.8875089999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.316551989999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:112.0 - frontier/cluster_39/score:2.9849129969509094 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:2.8823509999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.37387 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.3 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:2.4659 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.8319299999999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.62613 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:1.8623509999999994 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:2.9596463929999994 
- frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:1.8337025899999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.5540999999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.4316489999999997 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.7 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.361509 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.7809975869999994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:1.9519625899999997 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.06042673257 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:2.9475403108999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:2.9176456999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.60613 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:119.0 - cluster/prob_snapshot/cluster_0:0.025355104596861962 - cluster/prob_snapshot/cluster_1:0.011312317901281408 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02383403843036222 - cluster/prob_snapshot/cluster_4:0.0 - 
cluster/prob_snapshot/cluster_5:0.01440936080496317 - cluster/prob_snapshot/cluster_6:0.011393582758552564 - cluster/prob_snapshot/cluster_7:0.01872947052881711 - cluster/prob_snapshot/cluster_8:0.020882401691008315 - cluster/prob_snapshot/cluster_9:0.013246157945983855 - cluster/prob_snapshot/cluster_10:0.03485068437397645 - cluster/prob_snapshot/cluster_11:0.014327510385365868 - cluster/prob_snapshot/cluster_12:0.020112417760120512 - cluster/prob_snapshot/cluster_13:0.02386457372487107 - cluster/prob_snapshot/cluster_14:0.02418760062958106 - cluster/prob_snapshot/cluster_15:0.021282405377354117 - cluster/prob_snapshot/cluster_16:0.015989228755336315 - cluster/prob_snapshot/cluster_17:0.009144387868704175 - cluster/prob_snapshot/cluster_18:0.02191344027263611 - cluster/prob_snapshot/cluster_19:0.0226570402738101 - cluster/prob_snapshot/cluster_20:0.021280371704131423 - cluster/prob_snapshot/cluster_21:0.015411475401654485 - cluster/prob_snapshot/cluster_22:0.009144387868704175 - cluster/prob_snapshot/cluster_23:0.020510601690421317 - cluster/prob_snapshot/cluster_24:0.028403791599949737 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.014555143038786477 - cluster/prob_snapshot/cluster_27:0.01592691182776723 - cluster/prob_snapshot/cluster_28:0.020351258833026895 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01347255178224299 - cluster/prob_snapshot/cluster_31:0.013361011782066893 - cluster/prob_snapshot/cluster_32:0.009144387868704175 - cluster/prob_snapshot/cluster_33:0.01656959468592475 - cluster/prob_snapshot/cluster_34:0.018023985943055234 - cluster/prob_snapshot/cluster_35:0.02191344027263611 - cluster/prob_snapshot/cluster_36:0.016249488755747207 - cluster/prob_snapshot/cluster_37:0.013918324652584425 - cluster/prob_snapshot/cluster_38:0.017082049765808006 - cluster/prob_snapshot/cluster_39:0.022010484798367324 - cluster/prob_snapshot/cluster_40:0.0 - 
cluster/prob_snapshot/cluster_41:0.02125420169159531 - cluster/prob_snapshot/cluster_42:0.010130796033526816 - cluster/prob_snapshot/cluster_43:0.016959996853495366 - cluster/prob_snapshot/cluster_44:0.018183328800449663 - cluster/prob_snapshot/cluster_45:0.01742455328904763 - cluster/prob_snapshot/cluster_46:0.020882401691008315 - cluster/prob_snapshot/cluster_47:0.019364850668204255 - cluster/prob_snapshot/cluster_48:0.013732811782653887 - cluster/prob_snapshot/cluster_49:0.021824171092495175 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.013521560937672305 - cluster/prob_snapshot/cluster_52:0.01883370781022283 - cluster/prob_snapshot/cluster_53:0.017930764951654413 - cluster/prob_snapshot/cluster_54:0.012535649848235705 - cluster/prob_snapshot/cluster_55:0.01741355878673956 - cluster/prob_snapshot/cluster_56:0.02050683057612965 - cluster/prob_snapshot/cluster_57:0.014393599732408984 - cluster/prob_snapshot/cluster_58:0.015193404739671712 - cluster/prob_snapshot/cluster_59:0.021734901912354238 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.021514461692006208 - cluster/prob_snapshot/cluster_63:0.011843460759262833
[36m(RewardLoopWorker pid=2826758)[0m WARNING:2026-04-12 15:11:32,187:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  15%|█▌        | 120/800 [3:40:54<20:07:28, 106.54s/it]
[36m(TaskRunner pid=2823680)[0m step:120 - global_seqlen/min:306332 - global_seqlen/max:383128 - global_seqlen/minmax_diff:76796 - global_seqlen/balanced_min:355283 - global_seqlen/balanced_max:355401 - global_seqlen/mean:355325.0 - frontier/skipped_zero_acc_count:27.0 - actor/entropy:np.float64(0.2117689694186636) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01088633667677641 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.010981072999129537) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005834258918541452) - actor/ppo_kl:np.float64(9.603921227394226e-05) - actor/pg_clipfrac_lower:np.float64(2.122336559741776e-07) - actor/grad_norm:np.float64(0.21619296876283792) - perf/mfu/actor:np.float64(0.2047620583477417) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.05770874023438) - actor/lr:np.float64(1e-06) - training/global_step:120 - training/epoch:0 - critic/score/mean:0.5915841460227966 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5830016732215881 - critic/rewards/max:1.009002923965454 - critic/rewards/min:-0.04938037693500519 - critic/advantages/mean:-0.1406489610671997 - critic/advantages/max:2.4748497009277344 - critic/advantages/min:-2.4748494625091553 - critic/returns/mean:-0.1406489610671997 - critic/returns/max:2.4748497009277344 - critic/returns/min:-2.4748494625091553 - response_length/mean:1057.6064453125 - response_length/max:8192.0 - response_length/min:184.0 - response_length/clip_ratio:0.006188118830323219 - response_length_non_aborted/mean:1057.6064453125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:184.0 - response_length_non_aborted/clip_ratio:0.006188118830323219 - response/aborted_ratio:0.0 - prompt_length/mean:231.16831970214844 - prompt_length/max:510.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.801145642995834e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.3947586100548506) - timing_s/agent_loop/generate_sequences/max:np.float64(27.220043218694627) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.27137573276741) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.220043218694627) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:316 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:28.842713019810617 - timing_s/reward:0.00013343431055545807 - timing_s/old_log_prob:10.114347822964191 - timing_s/ref:18.319172801449895 - timing_s/adv:0.0826216796413064 - timing_s/update_actor:21.183755291625857 - timing_s/update_weights:25.063210914842784 - timing_s/step:104.00567009020597 - timing_s/stop_profile:5.029700696468353e-05 - timing_per_token_ms/adv:7.934245593741312e-05 - timing_per_token_ms/update_actor:0.020342979931074548 - timing_per_token_ms/gen:0.03375208943674257 - timing_per_token_ms/ref:0.01759209165341428 - perf/total_num_tokens:1421300 - perf/time_per_step:104.00567009020597 - perf/throughput:3416.4002759832256 - frontier/active_count:56.0 - frontier/completed_count:8.0 - frontier/blacklisted_count:894.0 - frontier/mean_score:2.4538949275600825 - frontier/mean_frontier_pct:0.36363157293065324 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 
- frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.4384876999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.5340999999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.2322109999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.9540999999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.5451205899999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:2.539964045299999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.8319299999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:1.7963542999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:128.0 - frontier/cluster_10/score:4.8083497789999985 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.9429999999999998 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.8092569929999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:3.7654463929999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.280158712798999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.8861757929999987 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - 
frontier/cluster_16/score:2.1683509999999995 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:2.9717524750999993 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.072594475099999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:2.8858999999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.8470562999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.5963509999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.9738699999999998 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.4119299999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.7598999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.8270562999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.1683509999999995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.2401 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.2470562999999992 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.4442909999999998 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:2.9717524750999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:2.203645699999999 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.8875089999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.316551989999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:112.0 - frontier/cluster_39/score:2.9894390978656364 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:2.8823509999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.37387 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:1.91 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:2.4659 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.5540999999999996 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:2.8823509999999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.62613 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.203645699999999 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - 
frontier/cluster_49/score:2.9596463929999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:1.8337025899999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.5540999999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.4316489999999997 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.09 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.361509 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.8466983108999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:1.9519625899999997 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.06042673257 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9632782176299997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:2.9423519899999997 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.60613 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:120.0 - cluster/prob_snapshot/cluster_0:0.025022084434756293 - cluster/prob_snapshot/cluster_1:0.011163739143623992 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0235209963243283 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.014220104726260115 - cluster/prob_snapshot/cluster_6:0.01124393664832957 - cluster/prob_snapshot/cluster_7:0.018483473069495566 - cluster/prob_snapshot/cluster_8:0.02060812710579694 - cluster/prob_snapshot/cluster_9:0.013072179658905727 - cluster/prob_snapshot/cluster_10:0.034990654223360965 - cluster/prob_snapshot/cluster_11:0.014139329350147589 - cluster/prob_snapshot/cluster_12:0.020443134252821543 - cluster/prob_snapshot/cluster_13:0.027401382759110793 - cluster/prob_snapshot/cluster_14:0.023869914750911596 - cluster/prob_snapshot/cluster_15:0.021002877045625517 - cluster/prob_snapshot/cluster_16:0.015779222303511 - cluster/prob_snapshot/cluster_17:0.009024283235778707 - cluster/prob_snapshot/cluster_18:0.02162562377383179 - cluster/prob_snapshot/cluster_19:0.02235945715022272 - cluster/prob_snapshot/cluster_20:0.021000870083165684 - cluster/prob_snapshot/cluster_21:0.015209057304070232 - cluster/prob_snapshot/cluster_22:0.009024283235778707 - cluster/prob_snapshot/cluster_23:0.02071820211225558 - cluster/prob_snapshot/cluster_24:0.026170865284473818 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.014363972220471343 - cluster/prob_snapshot/cluster_27:0.01755176152316082 - cluster/prob_snapshot/cluster_28:0.02008396040837485 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013295599982996426 - cluster/prob_snapshot/cluster_31:0.015779222303511 - cluster/prob_snapshot/cluster_32:0.009024283235778707 - cluster/prob_snapshot/cluster_33:0.016351965565632547 - cluster/prob_snapshot/cluster_34:0.017787254491302935 - cluster/prob_snapshot/cluster_35:0.02162562377383179 - cluster/prob_snapshot/cluster_36:0.016036063985247824 - cluster/prob_snapshot/cluster_37:0.013735517963133156 - cluster/prob_snapshot/cluster_38:0.016857689934817184 - cluster/prob_snapshot/cluster_39:0.02175433040500789 - cluster/prob_snapshot/cluster_40:0.0 - 
cluster/prob_snapshot/cluster_41:0.02097504379399241 - cluster/prob_snapshot/cluster_42:0.009997735673848311 - cluster/prob_snapshot/cluster_43:0.013899186340083323 - cluster/prob_snapshot/cluster_44:0.017944504500529566 - cluster/prob_snapshot/cluster_45:0.018586341272883146 - cluster/prob_snapshot/cluster_46:0.02097504379399241 - cluster/prob_snapshot/cluster_47:0.019110507970305245 - cluster/prob_snapshot/cluster_48:0.016036063985247824 - cluster/prob_snapshot/cluster_49:0.02153752707699606 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.013343965440158851 - cluster/prob_snapshot/cluster_52:0.018586341272883146 - cluster/prob_snapshot/cluster_53:0.017695257887265586 - cluster/prob_snapshot/cluster_54:0.015209057304070232 - cluster/prob_snapshot/cluster_55:0.017184844834965355 - cluster/prob_snapshot/cluster_56:0.020715597003769393 - cluster/prob_snapshot/cluster_57:0.014204550663498253 - cluster/prob_snapshot/cluster_58:0.014993850835643694 - cluster/prob_snapshot/cluster_59:0.02156395608604678 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.021411674652945016 - cluster/prob_snapshot/cluster_63:0.011687905841046089
[36m(TaskRunner pid=2823680)[0m Training Progress:  15%|█▌        | 121/800 [3:42:43<20:14:50, 107.35s/it]
[36m(TaskRunner pid=2823680)[0m step:121 - global_seqlen/min:368288 - global_seqlen/max:432956 - global_seqlen/minmax_diff:64668 - global_seqlen/balanced_min:394122 - global_seqlen/balanced_max:394250 - global_seqlen/mean:394176.0 - frontier/skipped_zero_acc_count:39.0 - actor/entropy:np.float64(0.24080501082870695) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010750111192464828 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.016029026432079263) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00034078752582394775) - actor/ppo_kl:np.float64(2.159046303707631e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.20710482199986777) - perf/mfu/actor:np.float64(0.23408143038549306) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.1732406616211) - actor/lr:np.float64(1e-06) - training/global_step:121 - training/epoch:0 - critic/score/mean:0.5603932738304138 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5505813956260681 - critic/rewards/max:1.0014671087265015 - critic/rewards/min:-0.04433498904109001 - critic/advantages/mean:-0.1534595638513565 - critic/advantages/max:2.4748501777648926 - critic/advantages/min:-2.474855422973633 - critic/returns/mean:-0.1534595638513565 - critic/returns/max:2.4748501777648926 - critic/returns/min:-2.474855422973633 - response_length/mean:1184.4676513671875 - response_length/max:8192.0 - response_length/min:172.0 - response_length/clip_ratio:0.007022472098469734 - response_length_non_aborted/mean:1184.4676513671875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:172.0 - response_length_non_aborted/clip_ratio:0.007022472098469734 - response/aborted_ratio:0.0 - prompt_length/mean:226.5842742919922 - prompt_length/max:381.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.166449308395386e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3766816975548863) - timing_s/agent_loop/generate_sequences/max:np.float64(28.797122048214078) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.06171998985792) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.797122048214078) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:211 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.489048022776842 - timing_s/reward:0.00018958095461130142 - timing_s/old_log_prob:10.04225137643516 - timing_s/ref:20.79859051015228 - timing_s/adv:0.06561203394085169 - timing_s/update_actor:19.71437805891037 - timing_s/update_weights:27.45177387073636 - timing_s/step:109.00746181979775 - timing_s/stop_profile:6.182678043842316e-05 - timing_per_token_ms/adv:6.530711502081948e-05 - timing_per_token_ms/update_actor:0.019622759395293742 - timing_per_token_ms/gen:0.03615269271003881 - timing_per_token_ms/ref:0.020701933184115645 - perf/total_num_tokens:1576704 - perf/time_per_step:109.00746181979775 - perf/throughput:3616.0460340927816 - frontier/active_count:56.0 - frontier/completed_count:8.0 - frontier/blacklisted_count:933.0 - frontier/mean_score:2.436237340387588 - frontier/mean_frontier_pct:0.3851108741842991 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.4384876999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.5340999999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.2322109999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.9540999999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.5451205899999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:2.539964045299999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.8823509999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.5574480099999999 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:144.0 - frontier/cluster_10/score:4.865844845299999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.9429999999999998 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:2.8664798950999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:3.5358124750999993 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.280158712798999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.8861757929999987 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:1.8178456999999997 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:2.9717524750999993 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.050816132569999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:2.8858999999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:1.763 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.8470562999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.5963509999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:1.681709 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.5883509999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.8319299999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.8270562999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.1683509999999995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.2401 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.2470562999999992 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.4442909999999998 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:2.9717524750999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:2.203645699999999 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.8875089999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.316551989999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:128.0 - frontier/cluster_39/score:2.9926073685059453 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:2.8823509999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.37387 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:1.91 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:2.62613 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.5540999999999996 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:2.8823509999999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.62613 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.203645699999999 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - 
frontier/cluster_49/score:2.9596463929999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:1.8337025899999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:2.0878699999999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.4316489999999997 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.09 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.361509 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.8926888176299994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:2.2663738129999995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.06042673257 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9632782176299997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:2.9423519899999997 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.60613 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:121.0 - cluster/prob_snapshot/cluster_0:0.025203441821337495 - cluster/prob_snapshot/cluster_1:0.011244652728614923 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.023691473985143843 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.014323170521469539 - cluster/prob_snapshot/cluster_6:0.01132543149624053 - cluster/prob_snapshot/cluster_7:0.018617439301588184 - cluster/prob_snapshot/cluster_8:0.021127068663695944 - cluster/prob_snapshot/cluster_9:0.011415789071978608 - cluster/prob_snapshot/cluster_10:0.035665690317918976 - cluster/prob_snapshot/cluster_11:0.014241809694086954 - cluster/prob_snapshot/cluster_12:0.021010736571250917 - cluster/prob_snapshot/cluster_13:0.025916813373315882 - cluster/prob_snapshot/cluster_14:0.02404292133509242 - cluster/prob_snapshot/cluster_15:0.021155103647754237 - cluster/prob_snapshot/cluster_16:0.013324453171700609 - cluster/prob_snapshot/cluster_17:0.009089690273616691 - cluster/prob_snapshot/cluster_18:0.021782363977512133 - cluster/prob_snapshot/cluster_19:0.022361885111534887 - cluster/prob_snapshot/cluster_20:0.021153082139045567 - cluster/prob_snapshot/cluster_21:0.01292244492572069 - cluster/prob_snapshot/cluster_22:0.009089690273616691 - cluster/prob_snapshot/cluster_23:0.020868365420973405 - cluster/prob_snapshot/cluster_24:0.026360548911548792 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.012326597806913678 - cluster/prob_snapshot/cluster_27:0.01897210620869771 - cluster/prob_snapshot/cluster_28:0.020757492602663748 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01339196506689791 - cluster/prob_snapshot/cluster_31:0.015893588415843096 - cluster/prob_snapshot/cluster_32:0.009089690273616691 - cluster/prob_snapshot/cluster_33:0.016470482859752524 - cluster/prob_snapshot/cluster_34:0.01791617460574858 - cluster/prob_snapshot/cluster_35:0.021782363977512133 - cluster/prob_snapshot/cluster_36:0.016152291658565632 - cluster/prob_snapshot/cluster_37:0.013835071525412437 - cluster/prob_snapshot/cluster_38:0.016979872664970874 - cluster/prob_snapshot/cluster_39:0.021935226264222348 - cluster/prob_snapshot/cluster_40:0.0 - 
cluster/prob_snapshot/cluster_41:0.021127068663695944 - cluster/prob_snapshot/cluster_42:0.010070198190640887 - cluster/prob_snapshot/cluster_43:0.013999926153219806 - cluster/prob_snapshot/cluster_44:0.019249018884164987 - cluster/prob_snapshot/cluster_45:0.01872105308269042 - cluster/prob_snapshot/cluster_46:0.021127068663695944 - cluster/prob_snapshot/cluster_47:0.019249018884164987 - cluster/prob_snapshot/cluster_48:0.016152291658565632 - cluster/prob_snapshot/cluster_49:0.0216936287652583 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.013440681071710937 - cluster/prob_snapshot/cluster_52:0.015303678438493734 - cluster/prob_snapshot/cluster_53:0.017823511220183656 - cluster/prob_snapshot/cluster_54:0.015319290921586069 - cluster/prob_snapshot/cluster_55:0.017309398748776936 - cluster/prob_snapshot/cluster_56:0.021202842843489375 - cluster/prob_snapshot/cluster_57:0.01661207644900062 - cluster/prob_snapshot/cluster_58:0.015102524659738206 - cluster/prob_snapshot/cluster_59:0.021720249328934453 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.02156686417632426 - cluster/prob_snapshot/cluster_63:0.011772618530089492
[36m(TaskRunner pid=2823680)[0m Training Progress:  15%|█▌        | 122/800 [3:44:22<19:44:18, 104.81s/it]
[36m(TaskRunner pid=2823680)[0m step:122 - global_seqlen/min:289681 - global_seqlen/max:433989 - global_seqlen/minmax_diff:144308 - global_seqlen/balanced_min:354185 - global_seqlen/balanced_max:354543 - global_seqlen/mean:354303.75 - frontier/skipped_zero_acc_count:26.0 - actor/entropy:np.float64(0.23349779194184378) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011723464354872704 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04695719329538406) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004355461067675873) - actor/ppo_kl:np.float64(-1.7922375249389578e-05) - actor/pg_clipfrac_lower:np.float64(3.0898278839909016e-06) - actor/grad_norm:np.float64(0.23554902924941137) - perf/mfu/actor:np.float64(0.17758073519139045) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.76626586914062) - actor/lr:np.float64(1e-06) - training/global_step:122 - training/epoch:0 - critic/score/mean:0.5735294222831726 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5635784268379211 - critic/rewards/max:1.023026466369629 - critic/rewards/min:-0.10385668277740479 - critic/advantages/mean:-0.10828282684087753 - critic/advantages/max:2.4748330116271973 - critic/advantages/min:-2.474828004837036 - critic/returns/mean:-0.10828282684087753 - critic/returns/max:2.4748330116271973 - critic/returns/min:-2.474828004837036 - response_length/mean:1113.193603515625 - response_length/max:8192.0 - response_length/min:160.0 - response_length/clip_ratio:0.00857843179255724 - response_length_non_aborted/mean:1113.193603515625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:160.0 - response_length_non_aborted/clip_ratio:0.00857843179255724 - response/aborted_ratio:0.0 - prompt_length/mean:224.8431396484375 - prompt_length/max:392.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.447002619504929e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3544320212677121) - timing_s/agent_loop/generate_sequences/max:np.float64(27.41528478451073) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.3935092625142715) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.41528478451073) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:278 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:28.96168544329703 - timing_s/reward:0.00018188264220952988 - timing_s/old_log_prob:10.026759046129882 - timing_s/ref:13.735357557423413 - timing_s/adv:0.10154455900192261 - timing_s/update_actor:23.25877501256764 - timing_s/update_weights:22.18754315096885 - timing_s/step:98.66689966246486 - timing_s/stop_profile:6.328709423542023e-05 - timing_per_token_ms/adv:9.300332009137126e-05 - timing_per_token_ms/update_actor:0.021302404763863907 - timing_per_token_ms/gen:0.031883277713275295 - timing_per_token_ms/ref:0.012580032529938886 - perf/total_num_tokens:1417215 - perf/time_per_step:98.66689966246486 - perf/throughput:3590.907905407564 - frontier/active_count:55.0 - frontier/completed_count:9.0 - frontier/blacklisted_count:958.0 - frontier/mean_score:2.4577069117408024 - frontier/mean_frontier_pct:0.3890756446633621 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.9069413899999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.5340999999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.1625476999999993 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.9540999999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.9815844129999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:2.539964045299999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.8823509999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.5574480099999999 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:144.0 - frontier/cluster_10/score:4.306091391709999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.9429999999999998 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:2.8664798950999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:3.5358124750999993 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.280158712798999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.8861757929999987 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:1.8178456999999997 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:2.9802267325699994 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.050816132569999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:2.8858999999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:1.763 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.8470562999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.5963509999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:1.681709 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.7118456999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.8319299999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.8270562999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.4178456999999995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:1.2401 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.2470562999999992 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.4442909999999998 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:2.9717524750999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:2.203645699999999 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.8875089999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.316551989999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:128.0 - frontier/cluster_39/score:2.9948251579541614 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:2.8823509999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:1.261709 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:1.91 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.7382909999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:2.6878699999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:2.8823509999999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.62613 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:80.0 - frontier/cluster_48/score:2.4425519899999992 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - 
frontier/cluster_49/score:2.9596463929999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.1835918129999996 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:2.0878699999999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.4316489999999997 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:2.3629999999999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.361509 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.9248821723409995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:2.2663738129999995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.06042673257 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9632782176299997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.60613 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:122.0 - cluster/prob_snapshot/cluster_0:0.0289030793951283 - cluster/prob_snapshot/cluster_1:0.011349086068595035 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.023396144999242075 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.014456195219765046 - cluster/prob_snapshot/cluster_6:0.014659521579638464 - cluster/prob_snapshot/cluster_7:0.018790346497129597 - cluster/prob_snapshot/cluster_8:0.021323283735676272 - cluster/prob_snapshot/cluster_9:0.011521811819863153 - cluster/prob_snapshot/cluster_10:0.03185594278322988 - cluster/prob_snapshot/cluster_11:0.014374078763626983 - cluster/prob_snapshot/cluster_12:0.0212058712231192 - cluster/prob_snapshot/cluster_13:0.026157512614772137 - cluster/prob_snapshot/cluster_14:0.024266217032923375 - cluster/prob_snapshot/cluster_15:0.021351579091227767 - cluster/prob_snapshot/cluster_16:0.013448202404488227 - cluster/prob_snapshot/cluster_17:0.009174109662776028 - cluster/prob_snapshot/cluster_18:0.022047356555547022 - cluster/prob_snapshot/cluster_19:0.022569568390584164 - cluster/prob_snapshot/cluster_20:0.02134953880800366 - cluster/prob_snapshot/cluster_21:0.013042460555982693 - cluster/prob_snapshot/cluster_22:0.009174109662776028 - cluster/prob_snapshot/cluster_23:0.0210621778181577 - cluster/prob_snapshot/cluster_24:0.02660536929266529 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.012441079579773737 - cluster/prob_snapshot/cluster_27:0.020061906169121537 - cluster/prob_snapshot/cluster_28:0.02095027528207831 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013516341308173384 - cluster/prob_snapshot/cluster_31:0.017886929763302528 - cluster/prob_snapshot/cluster_32:0.009174109662776028 - cluster/prob_snapshot/cluster_33:0.016623450459343395 - cluster/prob_snapshot/cluster_34:0.01808256889100595 - cluster/prob_snapshot/cluster_35:0.021984665024750813 - cluster/prob_snapshot/cluster_36:0.01630230409620582 - cluster/prob_snapshot/cluster_37:0.013963563063847039 - cluster/prob_snapshot/cluster_38:0.01713757116021452 - cluster/prob_snapshot/cluster_39:0.022155353939127504 - cluster/prob_snapshot/cluster_40:0.0 - 
cluster/prob_snapshot/cluster_41:0.021323283735676272 - cluster/prob_snapshot/cluster_42:0.009333970428603726 - cluster/prob_snapshot/cluster_43:0.014129948758892197 - cluster/prob_snapshot/cluster_44:0.020257545296824955 - cluster/prob_snapshot/cluster_45:0.019884536843226996 - cluster/prob_snapshot/cluster_46:0.021323283735676272 - cluster/prob_snapshot/cluster_47:0.01942779179800501 - cluster/prob_snapshot/cluster_48:0.01806970390556553 - cluster/prob_snapshot/cluster_49:0.021895105695041942 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.016153947868076706 - cluster/prob_snapshot/cluster_52:0.015445809484412695 - cluster/prob_snapshot/cluster_53:0.017989044905555734 - cluster/prob_snapshot/cluster_54:0.017481187914796994 - cluster/prob_snapshot/cluster_55:0.017470157677310343 - cluster/prob_snapshot/cluster_56:0.021637924199463674 - cluster/prob_snapshot/cluster_57:0.016766359081772313 - cluster/prob_snapshot/cluster_58:0.015242787514484699 - cluster/prob_snapshot/cluster_59:0.021921973493954607 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.011881955288020694
[36m(TaskRunner pid=2823680)[0m Training Progress:  15%|█▌        | 123/800 [3:46:14<20:08:46, 107.13s/it]
[36m(TaskRunner pid=2823680)[0m step:123 - global_seqlen/min:330781 - global_seqlen/max:430952 - global_seqlen/minmax_diff:100171 - global_seqlen/balanced_min:389038 - global_seqlen/balanced_max:389181 - global_seqlen/mean:389115.5 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.2652042789074282) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010615539737045765 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.01863761624554172) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005215529345908484) - actor/ppo_kl:np.float64(1.8488766703110098e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.22557081654667854) - perf/mfu/actor:np.float64(0.2119489696257785) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.56721115112305) - actor/lr:np.float64(1e-06) - training/global_step:123 - training/epoch:0 - critic/score/mean:0.58203125 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5728376507759094 - critic/rewards/max:1.021120309829712 - critic/rewards/min:-0.04716251417994499 - critic/advantages/mean:-0.1205468401312828 - critic/advantages/max:2.474811315536499 - critic/advantages/min:-2.474848747253418 - critic/returns/mean:-0.1205468401312828 - critic/returns/max:2.474811315536499 - critic/returns/min:-2.474848747253418 - response_length/mean:1184.56640625 - response_length/max:8192.0 - response_length/min:147.0 - response_length/clip_ratio:0.0078125 - response_length_non_aborted/mean:1184.56640625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:147.0 - response_length_non_aborted/clip_ratio:0.0078125 - response/aborted_ratio:0.0 - prompt_length/mean:230.7604217529297 - prompt_length/max:386.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - 
num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.18297266960144e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2093768045306206) - timing_s/agent_loop/generate_sequences/max:np.float64(29.04638434574008) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.996167648214396) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.04638434574008) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:190 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.919665306806564 - timing_s/reward:0.0001382380723953247 - timing_s/old_log_prob:9.598889929242432 - timing_s/ref:21.88638412859291 - timing_s/adv:0.08019044250249863 - timing_s/update_actor:21.519729915075004 - timing_s/update_weights:27.924273400567472 - timing_s/step:112.29937210865319 - timing_s/stop_profile:4.92902472615242e-05 - timing_per_token_ms/adv:7.377422442962933e-05 - timing_per_token_ms/update_actor:0.01979788781400332 - timing_per_token_ms/gen:0.033987103345003135 - timing_per_token_ms/ref:0.020135205197372247 - perf/total_num_tokens:1556462 - perf/time_per_step:112.29937210865319 - perf/throughput:3464.983754526414 - frontier/active_count:53.0 - frontier/completed_count:11.0 - frontier/blacklisted_count:990.0 - frontier/mean_score:2.4752477927261114 - frontier/mean_frontier_pct:0.4003923043892822 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 
- frontier/cluster_0/score:3.9069413899999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.5340999999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.1625476999999993 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.9540999999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.9815844129999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:2.0779748317099993 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.8823509999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.390213607 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:160.0 - frontier/cluster_10/score:4.514263974196999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.9429999999999998 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:2.8664798950999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.280158712798999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.320323055099999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:1.8178456999999997 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:2.9802267325699994 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:112.0 - frontier/cluster_19/score:3.6355712927989994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:2.8858999999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:2.1340999999999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.8470562999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.417445699999999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:2.0771962999999998 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.798291989999999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.8823509999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.8270562999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.4178456999999995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.4729394099999995 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.4442909999999998 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:2.9717524750999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:2.203645699999999 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.8875089999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:2.521586392999999 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.996377610567913 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:2.8823509999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:1.261709 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:1.91 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.7382909999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:2.6878699999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:2.9176456999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.62613 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:80.0 - frontier/cluster_48/score:2.4425519899999992 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:2.9596463929999994 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.1835918129999996 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:2.0878699999999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.4316489999999997 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:2.3629999999999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.361509 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.9248821723409995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:2.2663738129999995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.06042673257 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9632782176299997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.60613 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:123.0 - cluster/prob_snapshot/cluster_0:0.02978121039018561 - cluster/prob_snapshot/cluster_1:0.011693893073626001 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.024106964763732378 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.014895402160988573 - 
cluster/prob_snapshot/cluster_6:0.015104905965703633 - cluster/prob_snapshot/cluster_7:0.015839655492929223 - cluster/prob_snapshot/cluster_8:0.02197112599873475 - cluster/prob_snapshot/cluster_9:0.010597098800441901 - cluster/prob_snapshot/cluster_10:0.03441061222891707 - cluster/prob_snapshot/cluster_11:0.014810790849393992 - cluster/prob_snapshot/cluster_12:0.021850146268820855 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.025003471254803598 - cluster/prob_snapshot/cluster_15:0.0176869889202846 - cluster/prob_snapshot/cluster_16:0.013856784590411843 - cluster/prob_snapshot/cluster_17:0.009452836712472202 - cluster/prob_snapshot/cluster_18:0.022717197539818376 - cluster/prob_snapshot/cluster_19:0.02771265364678688 - cluster/prob_snapshot/cluster_20:0.021998178750522963 - cluster/prob_snapshot/cluster_21:0.01626747748414396 - cluster/prob_snapshot/cluster_22:0.009452836712472202 - cluster/prob_snapshot/cluster_23:0.021702087182578232 - cluster/prob_snapshot/cluster_24:0.02604996062885272 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.015833721025442643 - cluster/prob_snapshot/cluster_27:0.021330374369235457 - cluster/prob_snapshot/cluster_28:0.02197112599873475 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013926993684697703 - cluster/prob_snapshot/cluster_31:0.018430369000929803 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.01885032855621913 - cluster/prob_snapshot/cluster_34:0.01863195202061559 - cluster/prob_snapshot/cluster_35:0.022652601320059196 - cluster/prob_snapshot/cluster_36:0.01679759936637489 - cluster/prob_snapshot/cluster_37:0.014387802895187235 - cluster/prob_snapshot/cluster_38:0.019221147027998348 - cluster/prob_snapshot/cluster_39:0.022840309879530767 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02197112599873475 - cluster/prob_snapshot/cluster_42:0.009617554355017007 - 
cluster/prob_snapshot/cluster_43:0.014559243706815505 - cluster/prob_snapshot/cluster_44:0.020873008381769385 - cluster/prob_snapshot/cluster_45:0.020488667215831508 - cluster/prob_snapshot/cluster_46:0.022240164814891262 - cluster/prob_snapshot/cluster_47:0.020018045379989215 - cluster/prob_snapshot/cluster_48:0.01861869617223936 - cluster/prob_snapshot/cluster_49:0.022560321006117514 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.01664473579145241 - cluster/prob_snapshot/cluster_52:0.015915082805313552 - cluster/prob_snapshot/cluster_53:0.018535586597085974 - cluster/prob_snapshot/cluster_54:0.01801229993675656 - cluster/prob_snapshot/cluster_55:0.018000934579496426 - cluster/prob_snapshot/cluster_56:0.02229532584336783 - cluster/prob_snapshot/cluster_57:0.01727575323257157 - cluster/prob_snapshot/cluster_58:0.015705892638494347 - cluster/prob_snapshot/cluster_59:0.02258800510030002 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.012242951882108685
[36m(TaskRunner pid=2823680)[0m Training Progress:  16%|█▌        | 124/800 [3:47:58<19:54:17, 106.00s/it]
[36m(TaskRunner pid=2823680)[0m step:124 - global_seqlen/min:321862 - global_seqlen/max:394056 - global_seqlen/minmax_diff:72194 - global_seqlen/balanced_min:360850 - global_seqlen/balanced_max:360944 - global_seqlen/mean:360917.5 - frontier/skipped_zero_acc_count:29.0 - actor/entropy:np.float64(0.26962894342839716) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011446282267570496 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.07473450254465774) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0007614004790229956) - actor/ppo_kl:np.float64(9.167832637558603e-05) - actor/pg_clipfrac_lower:np.float64(3.6260188790038227e-06) - actor/grad_norm:np.float64(0.24697639391972467) - perf/mfu/actor:np.float64(0.21094173367059615) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.68038558959961) - actor/lr:np.float64(1e-06) - training/global_step:124 - training/epoch:0 - critic/score/mean:0.5795454382896423 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5703870058059692 - critic/rewards/max:1.0026189088821411 - critic/rewards/min:-0.07271915674209595 - critic/advantages/mean:-0.1283947229385376 - critic/advantages/max:2.474853038787842 - critic/advantages/min:-2.474849224090576 - critic/returns/mean:-0.1283947229385376 - critic/returns/max:2.474853038787842 - critic/returns/min:-2.474849224090576 - response_length/mean:1031.33203125 - response_length/max:8192.0 - response_length/min:186.0 - response_length/clip_ratio:0.005050505045801401 - response_length_non_aborted/mean:1031.33203125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:186.0 - response_length_non_aborted/clip_ratio:0.005050505045801401 - response/aborted_ratio:0.0 - prompt_length/mean:231.5858612060547 - prompt_length/max:353.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.806213736534119e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7119842255488038) - timing_s/agent_loop/generate_sequences/max:np.float64(27.981011821888387) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.716308546810978) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.981011821888387) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:232 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.245111491531134 - timing_s/reward:0.00013078376650810242 - timing_s/old_log_prob:9.609881124459207 - timing_s/ref:19.211811649613082 - timing_s/adv:0.07599104382097721 - timing_s/update_actor:20.08861193805933 - timing_s/update_weights:24.536589423194528 - timing_s/step:103.17861445154995 - timing_s/stop_profile:5.4826028645038605e-05 - timing_per_token_ms/adv:7.597349394387617e-05 - timing_per_token_ms/update_actor:0.020083972540402496 - timing_per_token_ms/gen:0.03580383745588797 - timing_per_token_ms/ref:0.019207374746046744 - perf/total_num_tokens:1443670 - perf/time_per_step:103.17861445154995 - perf/throughput:3497.9874649264425 - frontier/active_count:52.0 - frontier/completed_count:12.0 - frontier/blacklisted_count:1019.0 - frontier/mean_score:2.4287156143636883 - frontier/mean_frontier_pct:0.40663896048852105 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 
- frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.9069413899999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.5340999999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:2.513783389999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.9540999999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.9815844129999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:1.7545823821969995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.8823509999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.390213607 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:176.0 - frontier/cluster_10/score:4.659984781937899 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.9429999999999998 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:2.8664798950999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.280158712798999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.524226138569999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.1724919899999997 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:2.9861587127989995 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:2.8858999999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:2.1340999999999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.8929394099999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.2922119899999993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:2.0771962999999998 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.798291989999999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.8823509999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.8270562999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.4178456999999995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - 
frontier/cluster_33/score:2.4729394099999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.0110037 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:2.9717524750999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:2.203645699999999 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.8875089999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:2.6651104750999988 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.996377610567913 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:2.9176456999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:1.261709 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:1.91 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.7382909999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:2.6878699999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:2.9176456999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.7382909999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.0097863929999993 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:2.9717524750999993 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:80.0 - frontier/cluster_51/score:1.8285142690999996 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:2.0878699999999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.4316489999999997 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:2.3629999999999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.361509 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.9248821723409995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:2.2663738129999995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.06042673257 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9632782176299997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.60613 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:124.0 - cluster/prob_snapshot/cluster_0:0.0309354820403357 - cluster/prob_snapshot/cluster_1:0.012147129496119468 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.01990434284826555 - cluster/prob_snapshot/cluster_4:0.0 - 
cluster/prob_snapshot/cluster_5:0.015472723908719805 - cluster/prob_snapshot/cluster_6:0.015690347742782662 - cluster/prob_snapshot/cluster_7:0.013892927063526978 - cluster/prob_snapshot/cluster_8:0.022822691382745224 - cluster/prob_snapshot/cluster_9:0.011007825247048002 - cluster/prob_snapshot/cluster_10:0.036898141318131614 - cluster/prob_snapshot/cluster_11:0.015384833199243941 - cluster/prob_snapshot/cluster_12:0.02269702267375181 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.025972565446968254 - cluster/prob_snapshot/cluster_15:0.019987029387066937 - cluster/prob_snapshot/cluster_16:0.017201969579435684 - cluster/prob_snapshot/cluster_17:0.009819213407299235 - cluster/prob_snapshot/cluster_18:0.02364464935814802 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.022850792655531697 - cluster/prob_snapshot/cluster_21:0.016897978656977092 - cluster/prob_snapshot/cluster_22:0.009819213407299235 - cluster/prob_snapshot/cluster_23:0.02290653128068408 - cluster/prob_snapshot/cluster_24:0.02606800428342818 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.016447410497985936 - cluster/prob_snapshot/cluster_27:0.02215710525421018 - cluster/prob_snapshot/cluster_28:0.022822691382745224 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.014466781482824391 - cluster/prob_snapshot/cluster_31:0.019144700358213692 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.019580936868083747 - cluster/prob_snapshot/cluster_34:0.015923292067710964 - cluster/prob_snapshot/cluster_35:0.023530579587675668 - cluster/prob_snapshot/cluster_36:0.017448647207787514 - cluster/prob_snapshot/cluster_37:0.01494545091460202 - cluster/prob_snapshot/cluster_38:0.02110256310703616 - cluster/prob_snapshot/cluster_39:0.02372556342796524 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.023102157709208093 - 
cluster/prob_snapshot/cluster_42:0.009990315239827522 - cluster/prob_snapshot/cluster_43:0.015123536495396772 - cluster/prob_snapshot/cluster_44:0.021682012499223306 - cluster/prob_snapshot/cluster_45:0.021282774889990634 - cluster/prob_snapshot/cluster_46:0.023102157709208093 - cluster/prob_snapshot/cluster_47:0.021682012499223306 - cluster/prob_snapshot/cluster_48:0.015913653331145198 - cluster/prob_snapshot/cluster_49:0.023530579587675668 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.014478325801616544 - cluster/prob_snapshot/cluster_52:0.016531925729133014 - cluster/prob_snapshot/cluster_53:0.019253996018583804 - cluster/prob_snapshot/cluster_54:0.018710427611844275 - cluster/prob_snapshot/cluster_55:0.018698621751679547 - cluster/prob_snapshot/cluster_56:0.023159456690122773 - cluster/prob_snapshot/cluster_57:0.017945333546134572 - cluster/prob_snapshot/cluster_58:0.016314627689064668 - cluster/prob_snapshot/cluster_59:0.023463479722691936 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.012717468937880429
[36m(TaskRunner pid=2823680)[0m local_global_step_folder: /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/checkpoints/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633/global_step_125
[36m(WorkerDict pid=2825158)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.
[36m(WorkerDict pid=2825158)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")
[36m(TaskRunner pid=2823680)[0m test_gen_batch meta info: {'eos_token_id': 151643, 'pad_token_id': 151643, 'recompute_log_prob': False, 'do_sample': True, 'validate': True, 'global_steps': 125}
[36m(TaskRunner pid=2823680)[0m validation generation end
[36m(TaskRunner pid=2823680)[0m Training Progress:  16%|█▌        | 125/800 [3:52:44<30:01:19, 160.12s/it]
[36m(WorkerDict pid=2825159)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.[32m [repeated 3x across cluster][0m
[36m(WorkerDict pid=2825159)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")[32m [repeated 3x across cluster][0m
[36m(TaskRunner pid=2823680)[0m step:125 - global_seqlen/min:301748 - global_seqlen/max:392395 - global_seqlen/minmax_diff:90647 - global_seqlen/balanced_min:339700 - global_seqlen/balanced_max:339875 - global_seqlen/mean:339785.0 - frontier/skipped_zero_acc_count:36.0 - actor/entropy:np.float64(0.23635666712146738) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013103889301419258 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.08986944882781245) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0009730250380728293) - actor/ppo_kl:np.float64(0.0001469174604155994) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2574381058414777) - perf/mfu/actor:np.float64(0.2138603533997746) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.0546760559082) - actor/lr:np.float64(1e-06) - val-aux/aime2024/reward/mean@16:np.float64(0.07083333333333333) - val-aux/aime2024/reward/std@16:np.float64(0.11238745542018967) - val-aux/aime2024/reward/best@2/mean:np.float64(0.11750000000000001) - val-aux/aime2024/reward/best@2/std:np.float64(0.1256232059093125) - val-aux/aime2024/reward/worst@2/mean:np.float64(0.027299999999999998) - val-aux/aime2024/reward/worst@2/std:np.float64(0.06268353933010509) - val-aux/aime2024/reward/maj@2/mean:np.float64(0.07339999999999999) - val-aux/aime2024/reward/maj@2/std:np.float64(0.11371180685559427) - val-aux/aime2024/reward/best@4/mean:np.float64(0.16883333333333334) - val-aux/aime2024/reward/best@4/std:np.float64(0.12449245169153644) - val-aux/aime2024/reward/worst@4/mean:np.float64(0.008766666666666667) - val-aux/aime2024/reward/worst@4/std:np.float64(0.02174972174969131) - val-aux/aime2024/reward/maj@4/mean:np.float64(0.08453333333333334) - val-aux/aime2024/reward/maj@4/std:np.float64(0.10725656025250577) - val-aux/aime2024/reward/best@8/mean:np.float64(0.22536666666666663) - 
val-aux/aime2024/reward/best@8/std:np.float64(0.10133956991965651) - val-aux/aime2024/reward/worst@8/mean:np.float64(0.0016) - val-aux/aime2024/reward/worst@8/std:np.float64(0.00712554091513994) - val-aux/aime2024/reward/maj@8/mean:np.float64(0.0952) - val-aux/aime2024/reward/maj@8/std:np.float64(0.09863717431027524) - val-aux/aime2024/reward/best@16/mean:np.float64(0.2682333333333334) - val-aux/aime2024/reward/best@16/std:np.float64(0.06339451543748241) - val-aux/aime2024/reward/worst@16/mean:np.float64(6.666666666666667e-05) - val-aux/aime2024/reward/worst@16/std:np.float64(0.0014892205269125785) - val-aux/aime2024/reward/maj@16/mean:np.float64(0.10203333333333335) - val-aux/aime2024/reward/maj@16/std:np.float64(0.08582313555594388) - val-aux/aime2024/score/mean@16:np.float64(0.07083333333333333) - val-aux/aime2024/score/std@16:np.float64(0.11238745542018967) - val-aux/aime2024/score/best@2/mean:np.float64(0.11750000000000001) - val-aux/aime2024/score/best@2/std:np.float64(0.1256232059093125) - val-aux/aime2024/score/worst@2/mean:np.float64(0.027299999999999998) - val-aux/aime2024/score/worst@2/std:np.float64(0.06268353933010509) - val-aux/aime2024/score/maj@2/mean:np.float64(0.07339999999999999) - val-aux/aime2024/score/maj@2/std:np.float64(0.11371180685559427) - val-aux/aime2024/score/best@4/mean:np.float64(0.16883333333333334) - val-aux/aime2024/score/best@4/std:np.float64(0.12449245169153644) - val-aux/aime2024/score/worst@4/mean:np.float64(0.008766666666666667) - val-aux/aime2024/score/worst@4/std:np.float64(0.02174972174969131) - val-aux/aime2024/score/maj@4/mean:np.float64(0.08453333333333334) - val-aux/aime2024/score/maj@4/std:np.float64(0.10725656025250577) - val-aux/aime2024/score/best@8/mean:np.float64(0.22536666666666663) - val-aux/aime2024/score/best@8/std:np.float64(0.10133956991965651) - val-aux/aime2024/score/worst@8/mean:np.float64(0.0016) - val-aux/aime2024/score/worst@8/std:np.float64(0.00712554091513994) - 
val-aux/aime2024/score/maj@8/mean:np.float64(0.0952) - val-aux/aime2024/score/maj@8/std:np.float64(0.09863717431027524) - val-aux/aime2024/score/best@16/mean:np.float64(0.2682333333333334) - val-aux/aime2024/score/best@16/std:np.float64(0.06339451543748241) - val-aux/aime2024/score/worst@16/mean:np.float64(6.666666666666667e-05) - val-aux/aime2024/score/worst@16/std:np.float64(0.0014892205269125785) - val-aux/aime2024/score/maj@16/mean:np.float64(0.10203333333333335) - val-aux/aime2024/score/maj@16/std:np.float64(0.08582313555594388) - val-core/aime2024/acc/mean@16:np.float64(0.07083333333333333) - val-aux/aime2024/acc/std@16:np.float64(0.11238745542018967) - val-aux/aime2024/acc/best@2/mean:np.float64(0.11750000000000001) - val-aux/aime2024/acc/best@2/std:np.float64(0.1256232059093125) - val-aux/aime2024/acc/worst@2/mean:np.float64(0.027299999999999998) - val-aux/aime2024/acc/worst@2/std:np.float64(0.06268353933010509) - val-aux/aime2024/acc/maj@2/mean:np.float64(0.07339999999999999) - val-aux/aime2024/acc/maj@2/std:np.float64(0.11371180685559427) - val-aux/aime2024/acc/best@4/mean:np.float64(0.16883333333333334) - val-aux/aime2024/acc/best@4/std:np.float64(0.12449245169153644) - val-aux/aime2024/acc/worst@4/mean:np.float64(0.008766666666666667) - val-aux/aime2024/acc/worst@4/std:np.float64(0.02174972174969131) - val-aux/aime2024/acc/maj@4/mean:np.float64(0.08453333333333334) - val-aux/aime2024/acc/maj@4/std:np.float64(0.10725656025250577) - val-aux/aime2024/acc/best@8/mean:np.float64(0.22536666666666663) - val-aux/aime2024/acc/best@8/std:np.float64(0.10133956991965651) - val-aux/aime2024/acc/worst@8/mean:np.float64(0.0016) - val-aux/aime2024/acc/worst@8/std:np.float64(0.00712554091513994) - val-aux/aime2024/acc/maj@8/mean:np.float64(0.0952) - val-aux/aime2024/acc/maj@8/std:np.float64(0.09863717431027524) - val-core/aime2024/acc/best@16/mean:np.float64(0.2682333333333334) - val-core/aime2024/acc/best@16/std:np.float64(0.06339451543748241) - 
val-aux/aime2024/acc/worst@16/mean:np.float64(6.666666666666667e-05) - val-aux/aime2024/acc/worst@16/std:np.float64(0.0014892205269125785) - val-core/aime2024/acc/maj@16/mean:np.float64(0.10203333333333335) - val-core/aime2024/acc/maj@16/std:np.float64(0.08582313555594388) - val-aux/aime2025/reward/mean@16:np.float64(0.05416666666666667) - val-aux/aime2025/reward/std@16:np.float64(0.10597310395080038) - val-aux/aime2025/reward/best@2/mean:np.float64(0.09233333333333334) - val-aux/aime2025/reward/best@2/std:np.float64(0.12563933236761077) - val-aux/aime2025/reward/worst@2/mean:np.float64(0.0145) - val-aux/aime2025/reward/worst@2/std:np.float64(0.05137396565126583) - val-aux/aime2025/reward/maj@2/mean:np.float64(0.052333333333333336) - val-aux/aime2025/reward/maj@2/std:np.float64(0.10482706652240621) - val-aux/aime2025/reward/best@4/mean:np.float64(0.14653333333333332) - val-aux/aime2025/reward/best@4/std:np.float64(0.13172130282570024) - val-aux/aime2025/reward/worst@4/mean:np.float64(0.0015) - val-aux/aime2025/reward/worst@4/std:np.float64(0.012316835077877458) - val-aux/aime2025/reward/maj@4/mean:np.float64(0.0605) - val-aux/aime2025/reward/maj@4/std:np.float64(0.10716277640706794) - val-aux/aime2025/reward/best@8/mean:np.float64(0.20579999999999998) - val-aux/aime2025/reward/best@8/std:np.float64(0.11483796863813321) - val-aux/aime2025/reward/worst@8/mean:np.float64(0.0) - val-aux/aime2025/reward/worst@8/std:np.float64(0.0) - val-aux/aime2025/reward/maj@8/mean:np.float64(0.0679) - val-aux/aime2025/reward/maj@8/std:np.float64(0.10733264096797507) - val-aux/aime2025/reward/best@16/mean:np.float64(0.25473333333333337) - val-aux/aime2025/reward/best@16/std:np.float64(0.08304916806715278) - val-aux/aime2025/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2025/reward/worst@16/std:np.float64(0.0) - val-aux/aime2025/reward/maj@16/mean:np.float64(0.07559999999999999) - val-aux/aime2025/reward/maj@16/std:np.float64(0.10349380373269321) - 
val-aux/aime2025/score/mean@16:np.float64(0.05416666666666667) - val-aux/aime2025/score/std@16:np.float64(0.10597310395080038) - val-aux/aime2025/score/best@2/mean:np.float64(0.09233333333333334) - val-aux/aime2025/score/best@2/std:np.float64(0.12563933236761077) - val-aux/aime2025/score/worst@2/mean:np.float64(0.0145) - val-aux/aime2025/score/worst@2/std:np.float64(0.05137396565126583) - val-aux/aime2025/score/maj@2/mean:np.float64(0.052333333333333336) - val-aux/aime2025/score/maj@2/std:np.float64(0.10482706652240621) - val-aux/aime2025/score/best@4/mean:np.float64(0.14653333333333332) - val-aux/aime2025/score/best@4/std:np.float64(0.13172130282570024) - val-aux/aime2025/score/worst@4/mean:np.float64(0.0015) - val-aux/aime2025/score/worst@4/std:np.float64(0.012316835077877458) - val-aux/aime2025/score/maj@4/mean:np.float64(0.0605) - val-aux/aime2025/score/maj@4/std:np.float64(0.10716277640706794) - val-aux/aime2025/score/best@8/mean:np.float64(0.20579999999999998) - val-aux/aime2025/score/best@8/std:np.float64(0.11483796863813321) - val-aux/aime2025/score/worst@8/mean:np.float64(0.0) - val-aux/aime2025/score/worst@8/std:np.float64(0.0) - val-aux/aime2025/score/maj@8/mean:np.float64(0.0679) - val-aux/aime2025/score/maj@8/std:np.float64(0.10733264096797507) - val-aux/aime2025/score/best@16/mean:np.float64(0.25473333333333337) - val-aux/aime2025/score/best@16/std:np.float64(0.08304916806715278) - val-aux/aime2025/score/worst@16/mean:np.float64(0.0) - val-aux/aime2025/score/worst@16/std:np.float64(0.0) - val-aux/aime2025/score/maj@16/mean:np.float64(0.07559999999999999) - val-aux/aime2025/score/maj@16/std:np.float64(0.10349380373269321) - val-core/aime2025/acc/mean@16:np.float64(0.05416666666666667) - val-aux/aime2025/acc/std@16:np.float64(0.10597310395080038) - val-aux/aime2025/acc/best@2/mean:np.float64(0.09233333333333334) - val-aux/aime2025/acc/best@2/std:np.float64(0.12563933236761077) - val-aux/aime2025/acc/worst@2/mean:np.float64(0.0145) - 
val-aux/aime2025/acc/worst@2/std:np.float64(0.05137396565126583) - val-aux/aime2025/acc/maj@2/mean:np.float64(0.052333333333333336) - val-aux/aime2025/acc/maj@2/std:np.float64(0.10482706652240621) - val-aux/aime2025/acc/best@4/mean:np.float64(0.14653333333333332) - val-aux/aime2025/acc/best@4/std:np.float64(0.13172130282570024) - val-aux/aime2025/acc/worst@4/mean:np.float64(0.0015) - val-aux/aime2025/acc/worst@4/std:np.float64(0.012316835077877458) - val-aux/aime2025/acc/maj@4/mean:np.float64(0.0605) - val-aux/aime2025/acc/maj@4/std:np.float64(0.10716277640706794) - val-aux/aime2025/acc/best@8/mean:np.float64(0.20579999999999998) - val-aux/aime2025/acc/best@8/std:np.float64(0.11483796863813321) - val-aux/aime2025/acc/worst@8/mean:np.float64(0.0) - val-aux/aime2025/acc/worst@8/std:np.float64(0.0) - val-aux/aime2025/acc/maj@8/mean:np.float64(0.0679) - val-aux/aime2025/acc/maj@8/std:np.float64(0.10733264096797507) - val-core/aime2025/acc/best@16/mean:np.float64(0.25473333333333337) - val-core/aime2025/acc/best@16/std:np.float64(0.08304916806715278) - val-aux/aime2025/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2025/acc/worst@16/std:np.float64(0.0) - val-core/aime2025/acc/maj@16/mean:np.float64(0.07559999999999999) - val-core/aime2025/acc/maj@16/std:np.float64(0.10349380373269321) - val-aux/math500/reward/mean@4:np.float64(0.683) - val-aux/math500/reward/std@4:np.float64(0.14372689603142602) - val-aux/math500/reward/best@2/mean:np.float64(0.74801) - val-aux/math500/reward/best@2/std:np.float64(0.11614848531534781) - val-aux/math500/reward/worst@2/mean:np.float64(0.61815) - val-aux/math500/reward/worst@2/std:np.float64(0.13038516044055096) - val-aux/math500/reward/maj@2/mean:np.float64(0.6831479999999999) - val-aux/math500/reward/maj@2/std:np.float64(0.1436565880670646) - val-aux/math500/reward/best@4/mean:np.float64(0.793846) - val-aux/math500/reward/best@4/std:np.float64(0.07035321903943935) - val-aux/math500/reward/worst@4/mean:np.float64(0.562174) - 
val-aux/math500/reward/worst@4/std:np.float64(0.09346294975476013) - val-aux/math500/reward/maj@4/mean:np.float64(0.69826) - val-aux/math500/reward/maj@4/std:np.float64(0.13077763050063362) - val-aux/math500/score/mean@4:np.float64(0.683) - val-aux/math500/score/std@4:np.float64(0.14372689603142602) - val-aux/math500/score/best@2/mean:np.float64(0.74801) - val-aux/math500/score/best@2/std:np.float64(0.11614848531534781) - val-aux/math500/score/worst@2/mean:np.float64(0.61815) - val-aux/math500/score/worst@2/std:np.float64(0.13038516044055096) - val-aux/math500/score/maj@2/mean:np.float64(0.6831479999999999) - val-aux/math500/score/maj@2/std:np.float64(0.1436565880670646) - val-aux/math500/score/best@4/mean:np.float64(0.793846) - val-aux/math500/score/best@4/std:np.float64(0.07035321903943935) - val-aux/math500/score/worst@4/mean:np.float64(0.562174) - val-aux/math500/score/worst@4/std:np.float64(0.09346294975476013) - val-aux/math500/score/maj@4/mean:np.float64(0.69826) - val-aux/math500/score/maj@4/std:np.float64(0.13077763050063362) - val-core/math500/acc/mean@4:np.float64(0.683) - val-aux/math500/acc/std@4:np.float64(0.14372689603142602) - val-aux/math500/acc/best@2/mean:np.float64(0.74801) - val-aux/math500/acc/best@2/std:np.float64(0.11614848531534781) - val-aux/math500/acc/worst@2/mean:np.float64(0.61815) - val-aux/math500/acc/worst@2/std:np.float64(0.13038516044055096) - val-aux/math500/acc/maj@2/mean:np.float64(0.6831479999999999) - val-aux/math500/acc/maj@2/std:np.float64(0.1436565880670646) - val-core/math500/acc/best@4/mean:np.float64(0.793846) - val-core/math500/acc/best@4/std:np.float64(0.07035321903943935) - val-aux/math500/acc/worst@4/mean:np.float64(0.562174) - val-aux/math500/acc/worst@4/std:np.float64(0.09346294975476013) - val-core/math500/acc/maj@4/mean:np.float64(0.69826) - val-core/math500/acc/maj@4/std:np.float64(0.13077763050063362) - val-aux/num_turns/min:np.int32(2) - val-aux/num_turns/max:np.int32(2) - 
val-aux/num_turns/mean:np.float64(2.0) - val-aux/response_length/clip_ratio:0.0679054054054054 - val-aux/aime2024/response_length/clip_ratio:0.16041666666666668 - val-aux/aime2025/response_length/clip_ratio:0.11666666666666667 - val-aux/math500/response_length/clip_ratio:0.034 - training/global_step:125 - training/epoch:0 - critic/score/mean:0.604619562625885 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5956308841705322 - critic/rewards/max:1.002161979675293 - critic/rewards/min:-0.05592293292284012 - critic/advantages/mean:-0.10909711569547653 - critic/advantages/max:2.474817991256714 - critic/advantages/min:-2.4748353958129883 - critic/returns/mean:-0.10909711569547653 - critic/returns/max:2.474817991256714 - critic/returns/min:-2.4748353958129883 - response_length/mean:953.5638427734375 - response_length/max:7972.0 - response_length/min:141.0 - response_length/clip_ratio:0.0 - response_length_non_aborted/mean:953.5638427734375 - response_length_non_aborted/max:7972.0 - response_length_non_aborted/min:141.0 - response_length_non_aborted/clip_ratio:0.0 - response/aborted_ratio:0.0 - prompt_length/mean:230.71739196777344 - prompt_length/max:434.0 - prompt_length/min:169.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.55475664138794e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.233765659853816) - timing_s/agent_loop/generate_sequences/max:np.float64(26.622658664360642) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.96670926983461) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - 
timing_s/agent_loop/slowest/generate_sequences:np.float64(26.622658664360642) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:241 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:28.797731813043356 - timing_s/reward:0.000287545844912529 - timing_s/old_log_prob:8.767248380929232 - timing_s/ref:17.89278909098357 - timing_s/adv:0.07778067793697119 - timing_s/update_actor:18.52476758416742 - timing_s/save_checkpoint:53.32398335915059 - timing_s/update_weights:24.539567100815475 - timing_s/step:152.32733893673867 - timing_s/testing:133.73897640500218 - timing_s/stop_profile:0.00039525143802165985 - timing_per_token_ms/adv:8.923578663100691e-05 - timing_per_token_ms/update_actor:0.021252993048856018 - timing_per_token_ms/gen:0.04103275585588297 - timing_per_token_ms/ref:0.020527940253368192 - perf/total_num_tokens:1359140 - perf/time_per_step:152.32733893673867 - perf/throughput:2230.6238812529396 - frontier/active_count:51.0 - frontier/completed_count:13.0 - frontier/blacklisted_count:1055.0 - frontier/mean_score:2.4095225764639356 - frontier/mean_frontier_pct:0.4171577360016755 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.9069413899999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.5340999999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.059648372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - 
frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.9540999999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.9815844129999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:1.7545823821969995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.9176456999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.390213607 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:192.0 - frontier/cluster_10/score:4.761989347356529 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.9429999999999998 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:128.0 - frontier/cluster_12/score:2.306535926569999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.280158712798999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.524226138569999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.1724919899999997 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.2401 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:2.9903110989592996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:2.9201299999999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - 
frontier/cluster_21/score:2.1340999999999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.8929394099999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.2922119899999993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:80.0 - frontier/cluster_26/score:2.3540374099999997 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.798291989999999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.9176456999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:1.5789394099999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.4178456999999995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.4729394099999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.0110037 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:2.9717524750999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:1.8425519899999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.8875089999999994 - 
frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:2.6651104750999988 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.996377610567913 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.9423519899999997 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:1.261709 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.237 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.7382909999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.1815089999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:2.9176456999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.7382909999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.0097863929999993 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:2.9717524750999993 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.5799599883699997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:2.0878699999999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.4316489999999997 - frontier/cluster_53/cluster_size:300.0 - 
frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:2.3629999999999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:2.5530562999999997 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.9248821723409995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:2.2663738129999995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9632782176299997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.60613 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:125.0 - cluster/prob_snapshot/cluster_0:0.031793308213775615 - cluster/prob_snapshot/cluster_1:0.01248396360784751 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.01676069052440801 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.015901775168564513 - cluster/prob_snapshot/cluster_6:0.01612543360782861 - cluster/prob_snapshot/cluster_7:0.014278171309769725 - cluster/prob_snapshot/cluster_8:0.02374276953222917 - cluster/prob_snapshot/cluster_9:0.0113130669949302 - cluster/prob_snapshot/cluster_10:0.03875138629382467 - cluster/prob_snapshot/cluster_11:0.015811447291602707 - cluster/prob_snapshot/cluster_12:0.018769774178666782 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.026692772308550057 - 
cluster/prob_snapshot/cluster_15:0.02054126018635377 - cluster/prob_snapshot/cluster_16:0.017678971997588302 - cluster/prob_snapshot/cluster_17:0.010091495515345609 - cluster/prob_snapshot/cluster_18:0.024334094867055858 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.02376298588761081 - cluster/prob_snapshot/cluster_21:0.01736655155172894 - cluster/prob_snapshot/cluster_22:0.010091495515345609 - cluster/prob_snapshot/cluster_23:0.023541718476075772 - cluster/prob_snapshot/cluster_24:0.02679085761845983 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.01915632446251979 - cluster/prob_snapshot/cluster_27:0.022771511223056633 - cluster/prob_snapshot/cluster_28:0.02374276953222917 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.012848850878975436 - cluster/prob_snapshot/cluster_31:0.01967557377497594 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.020123906915358772 - cluster/prob_snapshot/cluster_34:0.016364837367868257 - cluster/prob_snapshot/cluster_35:0.02418307134520511 - cluster/prob_snapshot/cluster_36:0.0149940368872479 - cluster/prob_snapshot/cluster_37:0.01535988114561283 - cluster/prob_snapshot/cluster_38:0.021687727124725623 - cluster/prob_snapshot/cluster_39:0.02438346199445858 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02394382058838257 - cluster/prob_snapshot/cluster_42:0.010267341920144498 - cluster/prob_snapshot/cluster_43:0.01820391538410461 - cluster/prob_snapshot/cluster_44:0.022283244372398384 - cluster/prob_snapshot/cluster_45:0.01775234923811473 - cluster/prob_snapshot/cluster_46:0.02374276953222917 - cluster/prob_snapshot/cluster_47:0.022283244372398384 - cluster/prob_snapshot/cluster_48:0.01635493135373125 - cluster/prob_snapshot/cluster_49:0.02418307134520511 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.012857155985050684 - 
cluster/prob_snapshot/cluster_52:0.016990348150652878 - cluster/prob_snapshot/cluster_53:0.019787900151918905 - cluster/prob_snapshot/cluster_54:0.019229258852319707 - cluster/prob_snapshot/cluster_55:0.020775869850717565 - cluster/prob_snapshot/cluster_56:0.0238016573865765 - cluster/prob_snapshot/cluster_57:0.018442949092803987 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0241141108329445 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.013070118290510479
[36m(TaskRunner pid=2823680)[0m Training Progress:  16%|█▌        | 126/800 [3:54:36<27:15:51, 145.63s/it]
[36m(TaskRunner pid=2823680)[0m step:126 - global_seqlen/min:357502 - global_seqlen/max:437379 - global_seqlen/minmax_diff:79877 - global_seqlen/balanced_min:382271 - global_seqlen/balanced_max:382394 - global_seqlen/mean:382330.25 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.2500042397296056) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010932703502476215 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.017997493923758157) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0011137697484476423) - actor/ppo_kl:np.float64(0.0001565108097919913) - actor/pg_clipfrac_lower:np.float64(2.613726186003381e-06) - actor/grad_norm:np.float64(0.23385134215156236) - perf/mfu/actor:np.float64(0.2122329774292485) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.78142929077148) - actor/lr:np.float64(1e-06) - training/global_step:126 - training/epoch:0 - critic/score/mean:0.5052083134651184 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.49601849913597107 - critic/rewards/max:1.004349708557129 - critic/rewards/min:-0.0549764409661293 - critic/advantages/mean:-0.1385778784751892 - critic/advantages/max:2.4748454093933105 - critic/advantages/min:-2.4748544692993164 - critic/returns/mean:-0.1385778784751892 - critic/returns/max:2.4748454093933105 - critic/returns/min:-2.4748544692993164 - response_length/mean:1179.9947509765625 - response_length/max:8192.0 - response_length/min:229.0 - response_length/clip_ratio:0.009114583022892475 - response_length_non_aborted/mean:1179.9947509765625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:229.0 - response_length_non_aborted/clip_ratio:0.009114583022892475 - response/aborted_ratio:0.0 - prompt_length/mean:240.6770782470703 - prompt_length/max:522.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.293241262435913e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6969246435910463) - timing_s/agent_loop/generate_sequences/max:np.float64(28.143533286638558) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.7700605059326335) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.143533286638558) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:309 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.126074148342013 - timing_s/reward:0.00020030979067087173 - timing_s/old_log_prob:9.797356330789626 - timing_s/ref:20.85820194054395 - timing_s/adv:0.07713313866406679 - timing_s/update_actor:21.174131699837744 - timing_s/update_weights:28.137212836183608 - timing_s/step:111.58383156079799 - timing_s/stop_profile:5.6634657084941864e-05 - timing_per_token_ms/adv:7.069456084091924e-05 - timing_per_token_ms/update_actor:0.019406651507170667 - timing_per_token_ms/gen:0.03434654344822101 - timing_per_token_ms/ref:0.019117093530188502 - perf/total_num_tokens:1529321 - perf/time_per_step:111.58383156079799 - perf/throughput:3426.394708373875 - frontier/active_count:50.0 - frontier/completed_count:14.0 - frontier/blacklisted_count:1087.0 - frontier/mean_score:2.4594978636808342 - frontier/mean_frontier_pct:0.434067295364861 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:5.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.9069413899999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.9738699999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.059648372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.6678699999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.6871090890999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:1.7545823821969995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.9176456999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.390213607 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:192.0 - frontier/cluster_10/score:4.761989347356529 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.9429999999999998 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:128.0 - frontier/cluster_12/score:2.306535926569999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.280158712798999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.524226138569999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - 
frontier/cluster_16/score:2.4207443929999997 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:2.9903110989592996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:2.9440909999999993 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:48.0 - frontier/cluster_21/score:2.3938699999999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.9250575869999995 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.2922119899999993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:80.0 - frontier/cluster_26/score:2.5478261869999996 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.798291989999999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.9176456999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:1.5789394099999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.4178456999999995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.4729394099999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.0110037 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:2.9717524750999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:96.0 - frontier/cluster_36/score:1.5897863929999996 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.8875089999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:2.6651104750999988 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.996377610567913 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.9423519899999997 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:1.261709 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.4659 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.7382909999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.1815089999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:2.9176456999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8168036999999995 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.0097863929999993 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - 
frontier/cluster_49/score:2.9802267325699994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.5799599883699997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:2.0878699999999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.4316489999999997 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:2.3629999999999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:2.5530562999999997 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.9248821723409995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:2.4864616690999997 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:3.574294752341 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.424291 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:126.0 - cluster/prob_snapshot/cluster_0:0.03177023609325646 - cluster/prob_snapshot/cluster_1:0.016050999914640676 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.016748527440618072 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01356268712105242 - cluster/prob_snapshot/cluster_6:0.013719134413681552 - cluster/prob_snapshot/cluster_7:0.014267809768056697 - cluster/prob_snapshot/cluster_8:0.0237255396159077 - cluster/prob_snapshot/cluster_9:0.011304857203001874 - cluster/prob_snapshot/cluster_10:0.038723264758033435 - cluster/prob_snapshot/cluster_11:0.01579997306516986 - cluster/prob_snapshot/cluster_12:0.01875615312076819 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.026673401601495846 - cluster/prob_snapshot/cluster_15:0.02052635357684185 - cluster/prob_snapshot/cluster_16:0.01968486680754553 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.024316435831207114 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.023940585950287696 - cluster/prob_snapshot/cluster_21:0.019466331199957887 - cluster/prob_snapshot/cluster_22:0.010084172206956843 - cluster/prob_snapshot/cluster_23:0.023785811162465642 - cluster/prob_snapshot/cluster_24:0.02677141573177008 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.020718263061932283 - cluster/prob_snapshot/cluster_27:0.022754986140237035 - cluster/prob_snapshot/cluster_28:0.0237255396159077 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.012839526582364995 - cluster/prob_snapshot/cluster_31:0.019661295386380218 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.020109303175397344 - cluster/prob_snapshot/cluster_34:0.016352961551187303 - cluster/prob_snapshot/cluster_35:0.02416552190578068 - cluster/prob_snapshot/cluster_36:0.01292773144043929 - cluster/prob_snapshot/cluster_37:0.015348734616709055 - cluster/prob_snapshot/cluster_38:0.02167198853437058 - cluster/prob_snapshot/cluster_39:0.0243657671333253 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.023926444771100844 - 
cluster/prob_snapshot/cluster_42:0.010259891001586413 - cluster/prob_snapshot/cluster_43:0.02005206051538979 - cluster/prob_snapshot/cluster_44:0.02226707362048226 - cluster/prob_snapshot/cluster_45:0.0177394665164311 - cluster/prob_snapshot/cluster_46:0.0237255396159077 - cluster/prob_snapshot/cluster_47:0.022905518574303035 - cluster/prob_snapshot/cluster_48:0.0163430627257565 - cluster/prob_snapshot/cluster_49:0.02423443237401193 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.012847825661498757 - cluster/prob_snapshot/cluster_52:0.016978018406369633 - cluster/prob_snapshot/cluster_53:0.01977354024907217 - cluster/prob_snapshot/cluster_54:0.01921530435048707 - cluster/prob_snapshot/cluster_55:0.02076079298706239 - cluster/prob_snapshot/cluster_56:0.02378438473586377 - cluster/prob_snapshot/cluster_57:0.020219262686236382 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.02906523973956036 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.011581965742132706
[36m(TaskRunner pid=2823680)[0m Training Progress:  16%|█▌        | 127/800 [3:56:23<25:02:30, 133.95s/it]
[36m(TaskRunner pid=2823680)[0m step:127 - global_seqlen/min:334569 - global_seqlen/max:419130 - global_seqlen/minmax_diff:84561 - global_seqlen/balanced_min:365067 - global_seqlen/balanced_max:365200 - global_seqlen/mean:365120.75 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.2783855226589367) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010871765203773975 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.056113365784767666) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0007398168958161477) - actor/ppo_kl:np.float64(9.329587706445135e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.23601543654998144) - perf/mfu/actor:np.float64(0.2049690034016859) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.75550270080566) - actor/lr:np.float64(1e-06) - training/global_step:127 - training/epoch:0 - critic/score/mean:0.5644736886024475 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5555456876754761 - critic/rewards/max:1.0013258457183838 - critic/rewards/min:-0.0472899004817009 - critic/advantages/mean:-0.12111325562000275 - critic/advantages/max:2.4748482704162598 - critic/advantages/min:-2.4748544692993164 - critic/returns/mean:-0.12111325562000275 - critic/returns/max:2.4748482704162598 - critic/returns/min:-2.4748544692993164 - response_length/mean:1056.8565673828125 - response_length/max:8192.0 - response_length/min:146.0 - response_length/clip_ratio:0.00657894741743803 - response_length_non_aborted/mean:1056.8565673828125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:146.0 - response_length_non_aborted/clip_ratio:0.00657894741743803 - response/aborted_ratio:0.0 - prompt_length/mean:240.8631591796875 - prompt_length/max:434.0 - prompt_length/min:182.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.479319512844086e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3509124033153057) - timing_s/agent_loop/generate_sequences/max:np.float64(28.270678219385445) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.7826861703752) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.270678219385445) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:202 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.134959554299712 - timing_s/reward:0.00011949241161346436 - timing_s/old_log_prob:8.692660133354366 - timing_s/ref:19.699628066271544 - timing_s/adv:0.11907425336539745 - timing_s/update_actor:20.82006833422929 - timing_s/update_weights:26.29193777870387 - timing_s/step:106.50857905671 - timing_s/stop_profile:6.737746298313141e-05 - timing_per_token_ms/adv:0.00012073226962414585 - timing_per_token_ms/update_actor:0.021109971573853015 - timing_per_token_ms/gen:0.03751811112434928 - timing_per_token_ms/ref:0.019973930047615447 - perf/total_num_tokens:1460483 - perf/time_per_step:106.50857905671 - perf/throughput:3428.0877017953 - frontier/active_count:49.0 - frontier/completed_count:15.0 - frontier/blacklisted_count:1120.0 - frontier/mean_score:2.495414756406109 - frontier/mean_frontier_pct:0.4548047235828742 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.6348589729999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.9738699999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.059648372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.6678699999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.6871090890999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:1.7545823821969995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:2.9423519899999997 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.390213607 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:192.0 - frontier/cluster_10/score:4.761989347356529 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.9429999999999998 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:1.9145751485989995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.280158712798999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:2.666958296998999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - frontier/cluster_16/score:2.4207443929999997 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:2.9903110989592996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.3608636999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.9757089999999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.9250575869999995 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.2922119899999993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:80.0 - frontier/cluster_26/score:2.5478261869999996 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.798291989999999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.9423519899999997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:1.5789394099999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.592491989999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - 
frontier/cluster_33/score:2.4729394099999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.0110037 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:3.5802267325699995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:96.0 - frontier/cluster_36/score:2.0128504750999996 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.8875089999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:2.6651104750999988 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.9974643273975388 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.9423519899999997 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.62613 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.7382909999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.1815089999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9423519899999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8168036999999995 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.0097863929999993 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:2.9802267325699994 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.5799599883699997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:2.0878699999999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.6021542999999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:2.3629999999999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:2.5530562999999997 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.9248821723409995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:2.4864616690999997 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:3.574294752341 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.8970037 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:127.0 - cluster/prob_snapshot/cluster_0:0.029726840068134384 - cluster/prob_snapshot/cluster_1:0.016142832016632525 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.016844350285818963 - cluster/prob_snapshot/cluster_4:0.0 - 
cluster/prob_snapshot/cluster_5:0.013640282914062673 - cluster/prob_snapshot/cluster_6:0.013797625283871386 - cluster/prob_snapshot/cluster_7:0.014349439758012984 - cluster/prob_snapshot/cluster_8:0.024063334418363124 - cluster/prob_snapshot/cluster_9:0.011369535341758975 - cluster/prob_snapshot/cluster_10:0.03894481100547149 - cluster/prob_snapshot/cluster_11:0.01589036897481445 - cluster/prob_snapshot/cluster_12:0.015657903006303812 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.02682600733007129 - cluster/prob_snapshot/cluster_15:0.02181109180635969 - cluster/prob_snapshot/cluster_16:0.01979748924265735 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.024455556722566215 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.01930776906445993 - cluster/prob_snapshot/cluster_21:0.016157871846043066 - cluster/prob_snapshot/cluster_22:0.010141866477440762 - cluster/prob_snapshot/cluster_23:0.02392189620690191 - cluster/prob_snapshot/cluster_24:0.026924582225634652 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.02083679783588502 - cluster/prob_snapshot/cluster_27:0.022885173556545433 - cluster/prob_snapshot/cluster_28:0.024063334418363124 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.012912984978783236 - cluster/prob_snapshot/cluster_31:0.02120208661109159 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.020224353925506918 - cluster/prob_snapshot/cluster_34:0.0164465212571884 - cluster/prob_snapshot/cluster_35:0.029280043126109948 - cluster/prob_snapshot/cluster_36:0.01646162467342746 - cluster/prob_snapshot/cluster_37:0.015436548869419989 - cluster/prob_snapshot/cluster_38:0.021795979829120956 - cluster/prob_snapshot/cluster_39:0.02451405772063351 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.024063334418363124 - cluster/prob_snapshot/cluster_42:0.0 - 
cluster/prob_snapshot/cluster_43:0.021477187172326027 - cluster/prob_snapshot/cluster_44:0.022394469557598367 - cluster/prob_snapshot/cluster_45:0.017840958791496903 - cluster/prob_snapshot/cluster_46:0.024063334418363124 - cluster/prob_snapshot/cluster_47:0.023036567227289008 - cluster/prob_snapshot/cluster_48:0.016436565797906037 - cluster/prob_snapshot/cluster_49:0.02437308410146318 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.012921331539188287 - cluster/prob_snapshot/cluster_52:0.01707515423131541 - cluster/prob_snapshot/cluster_53:0.02128110754318065 - cluster/prob_snapshot/cluster_54:0.019325240292067183 - cluster/prob_snapshot/cluster_55:0.02087957108619381 - cluster/prob_snapshot/cluster_56:0.023920461619328516 - cluster/prob_snapshot/cluster_57:0.020334942544380848 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.029231529819578755 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.015514199042505516
[36m(TaskRunner pid=2823680)[0m Training Progress:  16%|█▌        | 128/800 [3:58:05<23:12:58, 124.37s/it]
[36m(TaskRunner pid=2823680)[0m step:128 - global_seqlen/min:289680 - global_seqlen/max:427692 - global_seqlen/minmax_diff:138012 - global_seqlen/balanced_min:351491 - global_seqlen/balanced_max:351568 - global_seqlen/mean:351524.5 - frontier/skipped_zero_acc_count:24.0 - actor/entropy:np.float64(0.22424208276117077) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012230911292135715 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.03924134913540911) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004210455238605321) - actor/ppo_kl:np.float64(9.926652866563648e-05) - actor/pg_clipfrac_lower:np.float64(1.6618362748816323e-06) - actor/grad_norm:np.float64(0.2535645973223906) - perf/mfu/actor:np.float64(0.1497112375773885) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.58604049682617) - actor/lr:np.float64(1e-06) - training/global_step:128 - training/epoch:0 - critic/score/mean:0.604567289352417 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5957440733909607 - critic/rewards/max:1.0006041526794434 - critic/rewards/min:-0.04663350060582161 - critic/advantages/mean:-0.16477815806865692 - critic/advantages/max:2.474820613861084 - critic/advantages/min:-2.4748544692993164 - critic/returns/mean:-0.16477815806865692 - critic/returns/max:2.474820613861084 - critic/returns/min:-2.4748544692993164 - response_length/mean:1056.717529296875 - response_length/max:8192.0 - response_length/min:199.0 - response_length/clip_ratio:0.012019230984151363 - response_length_non_aborted/mean:1056.717529296875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:199.0 - response_length_non_aborted/clip_ratio:0.012019230984151363 - response/aborted_ratio:0.0 - prompt_length/mean:241.71153259277344 - prompt_length/max:879.0 - prompt_length/min:182.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.591450750827789e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5244181668385863) - timing_s/agent_loop/generate_sequences/max:np.float64(27.422550613060594) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.915621630413625) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.422550613060594) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:211 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:28.747041928581893 - timing_s/reward:0.0001632971689105034 - timing_s/old_log_prob:10.951156650669873 - timing_s/ref:11.197653505019844 - timing_s/adv:0.08361456450074911 - timing_s/update_actor:27.581119661219418 - timing_s/update_weights:22.693040408194065 - timing_s/step:101.76363811735064 - timing_s/stop_profile:5.1596201956272125e-05 - timing_per_token_ms/adv:7.739989475146938e-05 - timing_per_token_ms/update_actor:0.025531147254697954 - timing_per_token_ms/gen:0.03269722656741826 - timing_per_token_ms/ref:0.010365385599110467 - perf/total_num_tokens:1406098 - perf/time_per_step:101.76363811735064 - perf/throughput:3454.3232386663785 - frontier/active_count:47.0 - frontier/completed_count:17.0 - frontier/blacklisted_count:1144.0 - frontier/mean_score:2.543546892819103 - frontier/mean_frontier_pct:0.4716786496284264 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:7.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:80.0 - frontier/cluster_0/score:4.044401281099999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.9738699999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.059648372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.6678699999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.6871090890999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:1.7545823821969995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:2.3596463929999993 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.390213607 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:192.0 - frontier/cluster_10/score:4.761989347356529 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:1.9429999999999998 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:1.9145751485989995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.196111098959299 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:128.0 - frontier/cluster_15/score:2.166870807899299 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - 
frontier/cluster_16/score:2.4207443929999997 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:2.9932177692715096 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.3608636999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.9757089999999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.9250575869999995 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.2922119899999993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:80.0 - frontier/cluster_26/score:2.5478261869999996 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.798291989999999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.9423519899999997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:1.5789394099999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.7147443929999993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.4729394099999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.0110037 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:3.4061587127989994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.2212562999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.7655773325699986 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.9974643273975388 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.9423519899999997 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.62613 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.7382909999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.1815089999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9423519899999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.8717625899999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.0097863929999993 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - 
frontier/cluster_49/score:3.5861587127989996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.5799599883699997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.7615089999999998 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.6021542999999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:2.3629999999999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:2.6871394099999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.9248821723409995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:2.4864616690999997 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:128.0 - frontier/cluster_59/score:4.0020063266387 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.8970037 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:128.0 - cluster/prob_snapshot/cluster_0:0.033831140023473504 - cluster/prob_snapshot/cluster_1:0.01651128751001959 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.01722881773173869 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.013951618444647504 - cluster/prob_snapshot/cluster_6:0.014112552108749487 - cluster/prob_snapshot/cluster_7:0.014676961589400383 - cluster/prob_snapshot/cluster_8:0.019738280645029138 - cluster/prob_snapshot/cluster_9:0.011629041712736087 - cluster/prob_snapshot/cluster_10:0.03983371510477093 - cluster/prob_snapshot/cluster_11:0.01625306207195411 - cluster/prob_snapshot/cluster_12:0.016015290134637317 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.026735250684635593 - cluster/prob_snapshot/cluster_15:0.018125726064175327 - cluster/prob_snapshot/cluster_16:0.020249361235081767 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.025038061862504304 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.019748463334803523 - cluster/prob_snapshot/cluster_21:0.01652667061920658 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02446790660090134 - cluster/prob_snapshot/cluster_24:0.027539128063562304 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.021312391747741202 - cluster/prob_snapshot/cluster_27:0.023407521054514657 - cluster/prob_snapshot/cluster_28:0.024612573098820225 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013207720143378588 - cluster/prob_snapshot/cluster_31:0.02270865112141965 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.02068596898142644 - cluster/prob_snapshot/cluster_34:0.016821908370061443 - cluster/prob_snapshot/cluster_35:0.02849228460424573 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.018580656984878596 - cluster/prob_snapshot/cluster_38:0.023133865183247282 - cluster/prob_snapshot/cluster_39:0.02507358403070529 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.024612573098820225 - cluster/prob_snapshot/cluster_42:0.0 - 
cluster/prob_snapshot/cluster_43:0.021967397786423495 - cluster/prob_snapshot/cluster_44:0.02290561687806139 - cluster/prob_snapshot/cluster_45:0.018248173539643094 - cluster/prob_snapshot/cluster_46:0.024612573098820225 - cluster/prob_snapshot/cluster_47:0.0240220975971105 - cluster/prob_snapshot/cluster_48:0.016811725680287055 - cluster/prob_snapshot/cluster_49:0.02999797228975872 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.013216257211621977 - cluster/prob_snapshot/cluster_52:0.01473490227344612 - cluster/prob_snapshot/cluster_53:0.021766842696192636 - cluster/prob_snapshot/cluster_54:0.019766333338151084 - cluster/prob_snapshot/cluster_55:0.022477737327186897 - cluster/prob_snapshot/cluster_56:0.02446643926927959 - cluster/prob_snapshot/cluster_57:0.020799081753688584 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.033476509129805376 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.01586830616923655
(RewardLoopWorker pid=2826757) WARNING:2026-04-12 15:30:28,139:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  16%|█▌        | 129/800 [3:59:54<22:19:01, 119.73s/it]
[36m(TaskRunner pid=2823680)[0m step:129 - global_seqlen/min:294142 - global_seqlen/max:425342 - global_seqlen/minmax_diff:131200 - global_seqlen/balanced_min:370938 - global_seqlen/balanced_max:371132 - global_seqlen/mean:371013.5 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.2144693873124197) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01105456706136465 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.07824930887727533) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00027001887804090074) - actor/ppo_kl:np.float64(2.5860820621540864e-05) - actor/pg_clipfrac_lower:np.float64(1.998823033015166e-07) - actor/grad_norm:np.float64(0.2246318869292736) - perf/mfu/actor:np.float64(0.20885641469873922) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.1681900024414) - actor/lr:np.float64(1e-06) - training/global_step:129 - training/epoch:0 - critic/score/mean:0.6223958134651184 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6130796074867249 - critic/rewards/max:1.0080252885818481 - critic/rewards/min:-0.04785415902733803 - critic/advantages/mean:-0.1381782442331314 - critic/advantages/max:2.474778890609741 - critic/advantages/min:-2.474853038787842 - critic/returns/mean:-0.1381782442331314 - critic/returns/max:2.474778890609741 - critic/returns/min:-2.474853038787842 - response_length/mean:1150.6888427734375 - response_length/max:8192.0 - response_length/min:122.0 - response_length/clip_ratio:0.010416666977107525 - response_length_non_aborted/mean:1150.6888427734375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:122.0 - response_length_non_aborted/clip_ratio:0.010416666977107525 - response/aborted_ratio:0.0 - prompt_length/mean:229.1979217529297 - prompt_length/max:408.0 - prompt_length/min:183.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.174365550279617e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0224066646769643) - timing_s/agent_loop/generate_sequences/max:np.float64(27.73511112947017) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.519213246970139) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.73511112947017) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:274 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.288557496853173 - timing_s/reward:0.00013715960085391998 - timing_s/old_log_prob:9.422936939634383 - timing_s/ref:19.915784304961562 - timing_s/adv:0.0697980122640729 - timing_s/update_actor:20.786216339096427 - timing_s/update_weights:27.787751315161586 - timing_s/step:108.65316318441182 - timing_s/stop_profile:5.169212818145752e-05 - timing_per_token_ms/adv:6.586252859305225e-05 - timing_per_token_ms/update_actor:0.01961420853641974 - timing_per_token_ms/gen:0.034273581037685956 - timing_per_token_ms/ref:0.018792854849159723 - perf/total_num_tokens:1484054 - perf/time_per_step:108.65316318441182 - perf/throughput:3414.658985770129 - frontier/active_count:47.0 - frontier/completed_count:17.0 - frontier/blacklisted_count:1176.0 - frontier/mean_score:2.5542347584951712 - frontier/mean_frontier_pct:0.48998090668440725 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:7.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:80.0 - frontier/cluster_0/score:4.044401281099999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.9738699999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.059648372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.6678699999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.6871090890999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.1282076675378994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:2.3596463929999993 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:1.2731495249 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:208.0 - frontier/cluster_10/score:4.83339254314957 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:1.6601 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:1.9145751485989995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.196111098959299 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:128.0 - frontier/cluster_15/score:2.416809565529509 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - 
frontier/cluster_16/score:2.4207443929999997 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:2.9932177692715096 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.5526045899999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:1.9757089999999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.9475403108999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:3.8045483929999993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:80.0 - frontier/cluster_26/score:2.5478261869999996 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.2588043929999992 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.9423519899999997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:1.5789394099999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.7147443929999993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.4729394099999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.0110037 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.284311098959299 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.2212562999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.7655773325699986 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.998225029178277 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.9423519899999997 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.7382909999999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.7382909999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:1.8270562999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9423519899999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.9102338129999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.0097863929999993 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - 
frontier/cluster_49/score:3.5861587127989996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.5799599883699997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.7615089999999998 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.6021542999999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:2.5540999999999996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:2.6871394099999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.9248821723409995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:112.0 - frontier/cluster_57/score:2.6405231683699997 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:128.0 - frontier/cluster_59/score:4.0020063266387 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.8970037 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:129.0 - cluster/prob_snapshot/cluster_0:0.03368957798457455 - cluster/prob_snapshot/cluster_1:0.016442198158518478 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.017156725967635237 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.013893239697978189 - cluster/prob_snapshot/cluster_6:0.01405349995593418 - cluster/prob_snapshot/cluster_7:0.017727820065220383 - cluster/prob_snapshot/cluster_8:0.019655688357257246 - cluster/prob_snapshot/cluster_9:0.010605245924923857 - cluster/prob_snapshot/cluster_10:0.04026181966004387 - cluster/prob_snapshot/cluster_11:0.013828516144911531 - cluster/prob_snapshot/cluster_12:0.01594827622013593 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.026623380478820315 - cluster/prob_snapshot/cluster_15:0.020131853560689986 - cluster/prob_snapshot/cluster_16:0.020164630396545126 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.024933293324261883 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.02126301655586427 - cluster/prob_snapshot/cluster_21:0.016457516899070546 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.024552803311281367 - cluster/prob_snapshot/cluster_24:0.031691620310040185 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.021223212795227927 - cluster/prob_snapshot/cluster_27:0.018815681595565077 - cluster/prob_snapshot/cluster_28:0.024509584963392308 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01315245414313721 - cluster/prob_snapshot/cluster_31:0.022613629701770104 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.020599411214127457 - cluster/prob_snapshot/cluster_34:0.016751519265662807 - cluster/prob_snapshot/cluster_35:0.02735808027038781 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.018502908623900034 - cluster/prob_snapshot/cluster_38:0.023037064510237692 - cluster/prob_snapshot/cluster_39:0.02497500344682227 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.024509584963392308 - cluster/prob_snapshot/cluster_42:0.0 - 
cluster/prob_snapshot/cluster_43:0.022809771280625224 - cluster/prob_snapshot/cluster_44:0.022809771280625224 - cluster/prob_snapshot/cluster_45:0.015219250371792253 - cluster/prob_snapshot/cluster_46:0.024509584963392308 - cluster/prob_snapshot/cluster_47:0.024242042809793352 - cluster/prob_snapshot/cluster_48:0.016741379184039517 - cluster/prob_snapshot/cluster_49:0.029872449646500886 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0131609554891204 - cluster/prob_snapshot/cluster_52:0.014673245976692346 - cluster/prob_snapshot/cluster_53:0.021675762152340797 - cluster/prob_snapshot/cluster_54:0.021275473215901775 - cluster/prob_snapshot/cluster_55:0.022383682136505656 - cluster/prob_snapshot/cluster_56:0.024364062612000133 - cluster/prob_snapshot/cluster_57:0.021995372125063244 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.033336430998109366 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.015801907290167408
[36m(TaskRunner pid=2823680)[0m Training Progress:  16%|█▋        | 130/800 [4:01:44<21:44:39, 116.83s/it]
[36m(TaskRunner pid=2823680)[0m step:130 - global_seqlen/min:280921 - global_seqlen/max:430956 - global_seqlen/minmax_diff:150035 - global_seqlen/balanced_min:366013 - global_seqlen/balanced_max:366204 - global_seqlen/mean:366120.75 - frontier/skipped_zero_acc_count:26.0 - actor/entropy:np.float64(0.2531870399400884) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011680173687636852 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.01645957304572221) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00029687843661853477) - actor/ppo_kl:np.float64(0.00011898345876929005) - actor/pg_clipfrac_lower:np.float64(1.1576447671840844e-06) - actor/grad_norm:np.float64(0.2514240638567851) - perf/mfu/actor:np.float64(0.19321344717681904) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.60549545288086) - actor/lr:np.float64(1e-06) - training/global_step:130 - training/epoch:0 - critic/score/mean:0.5723039507865906 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5628165602684021 - critic/rewards/max:1.0018712282180786 - critic/rewards/min:-0.043337106704711914 - critic/advantages/mean:-0.14153917133808136 - critic/advantages/max:2.4747507572174072 - critic/advantages/min:-2.4748525619506836 - critic/returns/mean:-0.14153917133808136 - critic/returns/max:2.4747507572174072 - critic/returns/min:-2.4748525619506836 - response_length/mean:1149.69970703125 - response_length/max:8192.0 - response_length/min:175.0 - response_length/clip_ratio:0.013480392284691334 - response_length_non_aborted/mean:1149.69970703125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:175.0 - response_length_non_aborted/clip_ratio:0.013480392284691334 - response/aborted_ratio:0.0 - prompt_length/mean:234.71568298339844 - prompt_length/max:322.0 - prompt_length/min:184.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010184850543737411 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4384995670989156) - timing_s/agent_loop/generate_sequences/max:np.float64(28.38157466147095) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.529864962563806) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.38157466147095) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:202 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.686863576062024 - timing_s/reward:0.00013182125985622406 - timing_s/old_log_prob:10.977307029999793 - timing_s/ref:19.962882233783603 - timing_s/adv:0.08933822810649872 - timing_s/update_actor:22.188152535818517 - timing_s/update_weights:26.45131209399551 - timing_s/step:109.75936042238027 - timing_s/stop_profile:5.637388676404953e-05 - timing_per_token_ms/adv:7.908256396396044e-05 - timing_per_token_ms/update_actor:0.01964104313848975 - timing_per_token_ms/gen:0.03164387929080165 - timing_per_token_ms/ref:0.017671224789417565 - perf/total_num_tokens:1464483 - perf/time_per_step:109.75936042238027 - perf/throughput:3335.667669628174 - frontier/active_count:47.0 - frontier/completed_count:17.0 - frontier/blacklisted_count:1202.0 - frontier/mean_score:2.560181041144701 - frontier/mean_frontier_pct:0.507677197198847 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:7.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:80.0 - frontier/cluster_0/score:4.044401281099999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.9738699999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.059648372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.6678699999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.6871090890999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.1282076675378994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:2.3596463929999993 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:1.2731495249 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:208.0 - frontier/cluster_10/score:4.83339254314957 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:1.6601 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:1.9145751485989995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.1372777692715093 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:128.0 - frontier/cluster_15/score:2.416809565529509 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - 
frontier/cluster_16/score:1.9945210750999998 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:2.9932177692715096 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.0868232129999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.2829962999999998 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.9475403108999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:4.1631838751 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:96.0 - frontier/cluster_26/score:2.6834783308999994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.2588043929999992 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.9596463929999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:1.5789394099999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.8003210750999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.6310575869999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.0110037 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.284311098959299 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.2212562999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.8359041327989987 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.998225029178277 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.9596463929999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.8168036999999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.7382909999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.1789394099999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9423519899999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.9102338129999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.0097863929999993 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - 
frontier/cluster_49/score:3.5861587127989996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.5799599883699997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.7615089999999998 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.7215080099999995 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:2.5540999999999996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:2.6871394099999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.9474175206386994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.1483662178589995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:128.0 - frontier/cluster_59/score:4.0020063266387 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.8970037 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:130.0 - cluster/prob_snapshot/cluster_0:0.03361133048964346 - cluster/prob_snapshot/cluster_1:0.016404009469492633 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.017116877714599788 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.013860971226009149 - cluster/prob_snapshot/cluster_6:0.0140208592631042 - cluster/prob_snapshot/cluster_7:0.017686645387659034 - cluster/prob_snapshot/cluster_8:0.01961003600815967 - cluster/prob_snapshot/cluster_9:0.010580614155207613 - cluster/prob_snapshot/cluster_10:0.04016830746077535 - cluster/prob_snapshot/cluster_11:0.013796398000022659 - cluster/prob_snapshot/cluster_12:0.01591123471539324 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.02607260571139871 - cluster/prob_snapshot/cluster_15:0.0200850952691446 - cluster/prob_snapshot/cluster_16:0.016575631932722535 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.024875383197162666 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.01734271646420942 - cluster/prob_snapshot/cluster_21:0.018973029087030375 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.024495776911202288 - cluster/prob_snapshot/cluster_24:0.03459848303605579 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.02230126804260761 - cluster/prob_snapshot/cluster_27:0.018771980248194438 - cluster/prob_snapshot/cluster_28:0.024596385505186114 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013121906221481206 - cluster/prob_snapshot/cluster_31:0.02327229930722904 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.021865621125975083 - cluster/prob_snapshot/cluster_34:0.016712612146688854 - cluster/prob_snapshot/cluster_35:0.0272945383272901 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.01845993372378635 - cluster/prob_snapshot/cluster_38:0.023568015243662487 - cluster/prob_snapshot/cluster_39:0.02491699644369198 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.024596385505186114 - cluster/prob_snapshot/cluster_42:0.0 - 
cluster/prob_snapshot/cluster_43:0.02340927952119536 - cluster/prob_snapshot/cluster_44:0.022756793250936713 - cluster/prob_snapshot/cluster_45:0.018108255718507646 - cluster/prob_snapshot/cluster_46:0.024452658942352077 - cluster/prob_snapshot/cluster_47:0.02418573818280315 - cluster/prob_snapshot/cluster_48:0.016702495616443554 - cluster/prob_snapshot/cluster_49:0.029803067823037138 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.01313038782221775 - cluster/prob_snapshot/cluster_52:0.01463916586026258 - cluster/prob_snapshot/cluster_53:0.022617316828028214 - cluster/prob_snapshot/cluster_54:0.021226058750592053 - cluster/prob_snapshot/cluster_55:0.022331693742489042 - cluster/prob_snapshot/cluster_56:0.024494756452606167 - cluster/prob_snapshot/cluster_57:0.017854174683083035 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.033259003723219195 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.01576520574225383
[36m(TaskRunner pid=2823680)[0m Training Progress:  16%|█▋        | 131/800 [4:03:38<21:33:56, 116.05s/it]
[36m(TaskRunner pid=2823680)[0m step:131 - global_seqlen/min:346825 - global_seqlen/max:457729 - global_seqlen/minmax_diff:110904 - global_seqlen/balanced_min:408365 - global_seqlen/balanced_max:408457 - global_seqlen/mean:408424.0 - frontier/skipped_zero_acc_count:40.0 - actor/entropy:np.float64(0.23250808287411928) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011171076446771622 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.019497009285259992) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00033001459880986) - actor/ppo_kl:np.float64(6.0489437526133436e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.22339594499631363) - perf/mfu/actor:np.float64(0.22566737121620067) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.71738815307617) - actor/lr:np.float64(1e-06) - training/global_step:131 - training/epoch:0 - critic/score/mean:0.5582386255264282 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5495896339416504 - critic/rewards/max:1.015554666519165 - critic/rewards/min:-0.04296905919909477 - critic/advantages/mean:-0.12976638972759247 - critic/advantages/max:2.4748477935791016 - critic/advantages/min:-2.4748547077178955 - critic/returns/mean:-0.12976638972759247 - critic/returns/max:2.4748477935791016 - critic/returns/min:-2.4748547077178955 - response_length/mean:1218.3863525390625 - response_length/max:8192.0 - response_length/min:124.0 - response_length/clip_ratio:0.014204545877873898 - response_length_non_aborted/mean:1218.3863525390625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:124.0 - response_length_non_aborted/clip_ratio:0.014204545877873898 - response/aborted_ratio:0.0 - prompt_length/mean:235.36363220214844 - prompt_length/max:404.0 - prompt_length/min:179.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.64928588271141e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0574636608362198) - timing_s/agent_loop/generate_sequences/max:np.float64(31.014073335565627) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.151306465814741) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(31.014073335565627) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:207 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.653132708743215 - timing_s/reward:0.0001265108585357666 - timing_s/old_log_prob:10.617272233590484 - timing_s/ref:21.240357723087072 - timing_s/adv:0.12294287513941526 - timing_s/update_actor:21.438476029783487 - timing_s/update_weights:27.498063107952476 - timing_s/step:113.94199356902391 - timing_s/stop_profile:5.551334470510483e-05 - timing_per_token_ms/adv:0.00012012709600896513 - timing_per_token_ms/update_actor:0.02094746739406657 - timing_per_token_ms/gen:0.03806862269948051 - timing_per_token_ms/ref:0.020753886620697912 - perf/total_num_tokens:1633696 - perf/time_per_step:113.94199356902391 - perf/throughput:3584.4905570533524 - frontier/active_count:44.0 - frontier/completed_count:20.0 - frontier/blacklisted_count:1242.0 - frontier/mean_score:2.4537975392569713 - frontier/mean_frontier_pct:0.5031846246007162 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:6.0 - frontier/batch_hard_count:8.0 - frontier/force_completed_count:8.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:80.0 - frontier/cluster_0/score:4.044401281099999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.9738699999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.059648372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.6678699999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.6871090890999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.1282076675378994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:2.3596463929999993 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:1.6601 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:160.0 - frontier/cluster_12/score:1.6402026040192996 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.096094438490056 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:128.0 - frontier/cluster_15/score:2.416809565529509 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - frontier/cluster_16/score:1.9945210750999998 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:2.9932177692715096 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.0868232129999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.2829962999999998 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.9475403108999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:96.0 - frontier/cluster_26/score:2.6834783308999994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:1.8811630750999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.9717524750999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:1.5789394099999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.8602247525699993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.6310575869999995 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.0110037 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.284311098959299 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:1.8548794099999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.8359041327989987 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.9987575204247934 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.9596463929999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.8168036999999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:2.2168036999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.1789394099999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9423519899999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.9102338129999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:112.0 - frontier/cluster_48/score:1.7068504750999995 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:3.5861587127989996 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.5799599883699997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.5330562999999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:2.8050556069999995 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.6878699999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:2.6871394099999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.9474175206386994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.1483662178589995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:128.0 - frontier/cluster_59/score:4.0020063266387 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:80.0 - frontier/cluster_63/score:1.62790259 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:131.0 - cluster/prob_snapshot/cluster_0:0.03745957417575878 - cluster/prob_snapshot/cluster_1:0.018282144757454592 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.01907663103681692 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.015447947826663248 - 
cluster/prob_snapshot/cluster_6:0.01562614183737711 - cluster/prob_snapshot/cluster_7:0.019711632808570308 - cluster/prob_snapshot/cluster_8:0.0218552371398479 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.015375981453616686 - cluster/prob_snapshot/cluster_12:0.015191690150939428 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.028676278937936794 - cluster/prob_snapshot/cluster_15:0.02238468710108134 - cluster/prob_snapshot/cluster_16:0.018473416697539432 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.027723426845945812 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.019328326618917403 - cluster/prob_snapshot/cluster_21:0.021145297733555515 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02730035850502136 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.02485459493229363 - cluster/prob_snapshot/cluster_27:0.017423485665903335 - cluster/prob_snapshot/cluster_28:0.02752461354248363 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.014624265456625787 - cluster/prob_snapshot/cluster_31:0.02649163468989319 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.024369069731407424 - cluster/prob_snapshot/cluster_34:0.018626080112255004 - cluster/prob_snapshot/cluster_35:0.030419556981811712 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.017180044218333506 - cluster/prob_snapshot/cluster_38:0.026266375128096087 - cluster/prob_snapshot/cluster_39:0.02777473647246864 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.027412486023752493 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.02608946536333887 - cluster/prob_snapshot/cluster_44:0.020532216479434276 - 
cluster/prob_snapshot/cluster_45:0.020181514340530376 - cluster/prob_snapshot/cluster_46:0.027252303854136586 - cluster/prob_snapshot/cluster_47:0.02695482268199275 - cluster/prob_snapshot/cluster_48:0.01580898816290249 - cluster/prob_snapshot/cluster_49:0.03321529417367828 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.014633718136638487 - cluster/prob_snapshot/cluster_52:0.014199292353563167 - cluster/prob_snapshot/cluster_53:0.02598065356881844 - cluster/prob_snapshot/cluster_54:0.02489527092930105 - cluster/prob_snapshot/cluster_55:0.024888504145197565 - cluster/prob_snapshot/cluster_56:0.027299221211617083 - cluster/prob_snapshot/cluster_57:0.019898342944025423 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.03706690865348671 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.015077766418971489
[36m(RewardLoopWorker pid=2826759)[0m WARNING:2026-04-12 15:36:02,534:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  16%|█▋        | 132/800 [4:05:24<20:58:58, 113.08s/it]
[36m(TaskRunner pid=2823680)[0m step:132 - global_seqlen/min:331898 - global_seqlen/max:352856 - global_seqlen/minmax_diff:20958 - global_seqlen/balanced_min:344354 - global_seqlen/balanced_max:344498 - global_seqlen/mean:344432.5 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.27544647455215454) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012418287806212902 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.041537456134392414) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006613075895150663) - actor/ppo_kl:np.float64(4.764506862819928e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.28150352959831554) - perf/mfu/actor:np.float64(0.19127451672210932) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.54838562011719) - actor/lr:np.float64(1e-06) - training/global_step:132 - training/epoch:0 - critic/score/mean:0.5924479365348816 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5832955241203308 - critic/rewards/max:1.0073257684707642 - critic/rewards/min:-0.05382349342107773 - critic/advantages/mean:-0.18091978132724762 - critic/advantages/max:2.474795341491699 - critic/advantages/min:-2.474839448928833 - critic/returns/mean:-0.18091978132724762 - critic/returns/max:2.474795341491699 - critic/returns/min:-2.474839448928833 - response_length/mean:1042.8138427734375 - response_length/max:8192.0 - response_length/min:147.0 - response_length/clip_ratio:0.013020833022892475 - response_length_non_aborted/mean:1042.8138427734375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:147.0 - response_length_non_aborted/clip_ratio:0.013020833022892475 - response/aborted_ratio:0.0 - prompt_length/mean:231.7395782470703 - prompt_length/max:388.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.892081677913666e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1908959923312068) - timing_s/agent_loop/generate_sequences/max:np.float64(27.821245152503252) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.066258867683246) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.821245152503252) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:194 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.740207268856466 - timing_s/reward:0.0001619327813386917 - timing_s/old_log_prob:9.497854858636856 - timing_s/ref:18.665936858393252 - timing_s/adv:0.08351618982851505 - timing_s/update_actor:21.412832349538803 - timing_s/update_weights:26.112078144215047 - timing_s/step:105.91840142104775 - timing_s/stop_profile:6.251689046621323e-05 - timing_per_token_ms/adv:8.532011297719181e-05 - timing_per_token_ms/update_actor:0.021875342720682187 - timing_per_token_ms/gen:0.03713436486676106 - timing_per_token_ms/ref:0.01906911516022591 - perf/total_num_tokens:1377730 - perf/time_per_step:105.91840142104775 - perf/throughput:3251.8664875880154 - frontier/active_count:44.0 - frontier/completed_count:20.0 - frontier/blacklisted_count:1274.0 - frontier/mean_score:2.4780498651808087 - frontier/mean_frontier_pct:0.5218192424622784 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:8.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:80.0 - frontier/cluster_0/score:3.731080896769999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.9738699999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.059648372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:1.4675089999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.6871090890999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.1282076675378994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:96.0 - frontier/cluster_8/score:1.9517524750999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:1.6601 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:160.0 - frontier/cluster_12/score:1.6402026040192996 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.096094438490056 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:144.0 - frontier/cluster_15/score:2.591766695870656 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:1.6961647525699999 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:2.9932177692715096 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.0868232129999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.2829962999999998 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:96.0 - frontier/cluster_23/score:2.9632782176299997 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:96.0 - frontier/cluster_26/score:2.6834783308999994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:1.8811630750999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.9802267325699994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:1.5789394099999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.9021573267989993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.6310575869999995 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.0110037 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.284311098959299 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:2.1984155869999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.8359041327989987 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.9987575204247934 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:2.9596463929999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.8168036999999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:2.2168036999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.1789394099999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9423519899999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.5371636690999995 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:128.0 - frontier/cluster_48/score:1.4947953325699996 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:3.4103110989592995 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.5799599883699997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.5330562999999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:2.8635389248999994 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.6878699999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:3.3809975869999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.9474175206386994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.1483662178589995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:144.0 - frontier/cluster_59/score:4.301404428647089 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:80.0 - frontier/cluster_63/score:2.039531813 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:132.0 - cluster/prob_snapshot/cluster_0:0.03421936511443926 - cluster/prob_snapshot/cluster_1:0.01810321997491709 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.018889930728669603 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.013459163086814531 - 
cluster/prob_snapshot/cluster_6:0.015473210982313572 - cluster/prob_snapshot/cluster_7:0.019518717827285387 - cluster/prob_snapshot/cluster_8:0.017900370537737634 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.015225498883087468 - cluster/prob_snapshot/cluster_12:0.015043011213500997 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.028395628224301955 - cluster/prob_snapshot/cluster_15:0.023770219223662414 - cluster/prob_snapshot/cluster_16:0.015556264409244543 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.027452101561882377 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.019139163001465273 - cluster/prob_snapshot/cluster_21:0.02093835167504537 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02717751291657309 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.024611346503166896 - cluster/prob_snapshot/cluster_27:0.017252964459273795 - cluster/prob_snapshot/cluster_28:0.027332955116012254 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.014481139824961015 - cluster/prob_snapshot/cluster_31:0.026616946652443997 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.024130573069217096 - cluster/prob_snapshot/cluster_34:0.018443789282714753 - cluster/prob_snapshot/cluster_35:0.0301218450508503 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.020162625181874932 - cluster/prob_snapshot/cluster_38:0.02600930980451436 - cluster/prob_snapshot/cluster_39:0.027502909028298196 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.027144203873836124 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.02583413143077323 - cluster/prob_snapshot/cluster_44:0.020331270561034974 - 
cluster/prob_snapshot/cluster_45:0.019984000694699272 - cluster/prob_snapshot/cluster_46:0.026985589384612485 - cluster/prob_snapshot/cluster_47:0.032440865907583646 - cluster/prob_snapshot/cluster_48:0.013709417906444726 - cluster/prob_snapshot/cluster_49:0.031277445833495336 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.014490499992922306 - cluster/prob_snapshot/cluster_52:0.014060325873959521 - cluster/prob_snapshot/cluster_53:0.02626276049800761 - cluster/prob_snapshot/cluster_54:0.02465162440990561 - cluster/prob_snapshot/cluster_55:0.031008598870302946 - cluster/prob_snapshot/cluster_56:0.027032047568506086 - cluster/prob_snapshot/cluster_57:0.019703600656873106 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.03945005019220152 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.018705433010573376
[36m(TaskRunner pid=2823680)[0m Training Progress:  17%|█▋        | 133/800 [4:07:04<20:12:37, 109.08s/it]
[36m(TaskRunner pid=2823680)[0m step:133 - global_seqlen/min:301459 - global_seqlen/max:337053 - global_seqlen/minmax_diff:35594 - global_seqlen/balanced_min:323335 - global_seqlen/balanced_max:323824 - global_seqlen/mean:323589.5 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.27718254812854404) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013177795335650444 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.007350498586674803) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00036695385360291496) - actor/ppo_kl:np.float64(-6.6457280880474576e-06) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.25591031337777775) - perf/mfu/actor:np.float64(0.19984589021494412) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.26212692260742) - actor/lr:np.float64(1e-06) - training/global_step:133 - training/epoch:0 - critic/score/mean:0.583776593208313 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.574497401714325 - critic/rewards/max:1.0019495487213135 - critic/rewards/min:-0.04815216735005379 - critic/advantages/mean:-0.1614154428243637 - critic/advantages/max:2.4748411178588867 - critic/advantages/min:-2.4748430252075195 - critic/returns/mean:-0.1614154428243637 - critic/returns/max:2.4748411178588867 - critic/returns/min:-2.4748430252075195 - response_length/mean:926.276611328125 - response_length/max:8192.0 - response_length/min:156.0 - response_length/clip_ratio:0.007978723384439945 - response_length_non_aborted/mean:926.276611328125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:156.0 - response_length_non_aborted/clip_ratio:0.007978723384439945 - response/aborted_ratio:0.0 - prompt_length/mean:238.48936462402344 - prompt_length/max:411.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.481089025735855e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2400968912988901) - timing_s/agent_loop/generate_sequences/max:np.float64(27.106845255941153) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.547307955013821) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.106845255941153) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:211 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.805887249298394 - timing_s/reward:0.0001244163140654564 - timing_s/old_log_prob:9.271625813096762 - timing_s/ref:17.48909036628902 - timing_s/adv:0.07760010566562414 - timing_s/update_actor:18.942244770005345 - timing_s/update_weights:23.495940061286092 - timing_s/step:99.53174967970699 - timing_s/stop_profile:5.0904229283332825e-05 - timing_per_token_ms/adv:8.859430447357717e-05 - timing_per_token_ms/update_actor:0.02162593705475183 - timing_per_token_ms/gen:0.04279012181190191 - timing_per_token_ms/ref:0.01996690318378386 - perf/total_num_tokens:1294358 - perf/time_per_step:99.53174967970699 - perf/throughput:3251.1183721908887 - frontier/active_count:43.0 - frontier/completed_count:21.0 - frontier/blacklisted_count:1308.0 - frontier/mean_score:2.508456063918853 - frontier/mean_frontier_pct:0.5317501904390847 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:8.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:80.0 - frontier/cluster_0/score:3.731080896769999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.9738699999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.059648372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:1.3272562999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.6871090890999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.1282076675378994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:96.0 - frontier/cluster_8/score:1.9517524750999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:1.6601 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:160.0 - frontier/cluster_12/score:2.0481418228135095 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.096094438490056 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:144.0 - frontier/cluster_15/score:2.591766695870656 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:1.4873153267989998 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:2.9932177692715096 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.3607762490999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.2829962999999998 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:96.0 - frontier/cluster_23/score:2.9742947523409997 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:96.0 - frontier/cluster_26/score:2.6834783308999994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:1.6168141525699995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.9802267325699994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:1.5789394099999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:2.9315101287592995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.6310575869999995 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.0110037 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:3.7990177692715092 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:2.1984155869999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:2.885132892959299 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.9987575204247934 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:2.9717524750999993 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.8168036999999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:2.2168036999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.1789394099999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9596463929999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.5371636690999995 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:128.0 - frontier/cluster_48/score:1.4947953325699996 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:3.287217769271509 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.5799599883699997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:2.8635389248999994 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.6878699999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:3.2666983108999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.9474175206386994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.1483662178589995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:144.0 - frontier/cluster_59/score:3.9109831000529622 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:96.0 - frontier/cluster_63/score:2.3276722691 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:133.0 - cluster/prob_snapshot/cluster_0:0.03459072870710028 - cluster/prob_snapshot/cluster_1:0.01829968407605207 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.019094932560733302 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.012304949656233586 - 
cluster/prob_snapshot/cluster_6:0.01564113306973913 - cluster/prob_snapshot/cluster_7:0.01973054353335084 - cluster/prob_snapshot/cluster_8:0.018094633227610065 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.015390732689920836 - cluster/prob_snapshot/cluster_12:0.01898825571108363 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.028703790064183488 - cluster/prob_snapshot/cluster_15:0.024028184091792436 - cluster/prob_snapshot/cluster_16:0.013788851647723427 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.02775002383565987 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.021886679230535548 - cluster/prob_snapshot/cluster_21:0.021165583871681416 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02757458916590243 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.024878439654283976 - cluster/prob_snapshot/cluster_27:0.014989431017098818 - cluster/prob_snapshot/cluster_28:0.027629584360183757 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01463829552008392 - cluster/prob_snapshot/cluster_31:0.027177934322950303 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.024392448655686485 - cluster/prob_snapshot/cluster_34:0.018643949391688307 - cluster/prob_snapshot/cluster_35:0.03522056922541844 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.02038143885360665 - cluster/prob_snapshot/cluster_38:0.026747972489870826 - cluster/prob_snapshot/cluster_39:0.027801382687035615 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.027551019796924717 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.026114494781446878 - cluster/prob_snapshot/cluster_44:0.020551914446555904 - 
cluster/prob_snapshot/cluster_45:0.02020087585497489 - cluster/prob_snapshot/cluster_46:0.027438784706554656 - cluster/prob_snapshot/cluster_47:0.03279292844504409 - cluster/prob_snapshot/cluster_48:0.013858198536067817 - cluster/prob_snapshot/cluster_49:0.030475688199756445 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.014647757268702549 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.02654777551973929 - cluster/prob_snapshot/cluster_54:0.024919154674572324 - cluster/prob_snapshot/cluster_55:0.03028545297372315 - cluster/prob_snapshot/cluster_56:0.02732541123169656 - cluster/prob_snapshot/cluster_57:0.019917432792677606 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.036258596137409234 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.02157977331694451
(RewardLoopWorker pid=2826760) WARNING:2026-04-12 15:39:33,736:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  17%|█▋        | 134/800 [4:08:52<20:07:28, 108.78s/it]
[36m(TaskRunner pid=2823680)[0m step:134 - global_seqlen/min:341585 - global_seqlen/max:431892 - global_seqlen/minmax_diff:90307 - global_seqlen/balanced_min:382293 - global_seqlen/balanced_max:382615 - global_seqlen/mean:382481.75 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.23653133649458277) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010899411514401436 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.053116435708943754) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003943996858613397) - actor/ppo_kl:np.float64(2.2539339808761413e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.24608426044384638) - perf/mfu/actor:np.float64(0.22872876387054372) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.7800521850586) - actor/lr:np.float64(1e-06) - training/global_step:134 - training/epoch:0 - critic/score/mean:0.6436170339584351 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6351021528244019 - critic/rewards/max:1.00943922996521 - critic/rewards/min:-0.048146285116672516 - critic/advantages/mean:-0.15406298637390137 - critic/advantages/max:2.474827527999878 - critic/advantages/min:-2.474848747253418 - critic/returns/mean:-0.15406298637390137 - critic/returns/max:2.474827527999878 - critic/returns/min:-2.474848747253418 - response_length/mean:1059.2606201171875 - response_length/max:8192.0 - response_length/min:142.0 - response_length/clip_ratio:0.010638297535479069 - response_length_non_aborted/mean:1059.2606201171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:142.0 - response_length_non_aborted/clip_ratio:0.010638297535479069 - response/aborted_ratio:0.0 - prompt_length/mean:240.64894104003906 - prompt_length/max:1168.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.027309715747833e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4129383666440845) - timing_s/agent_loop/generate_sequences/max:np.float64(29.05272710788995) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.89149426197946) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.05272710788995) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:193 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.111968299373984 - timing_s/reward:0.00011718180030584335 - timing_s/old_log_prob:8.957496356219053 - timing_s/ref:19.976178410463035 - timing_s/adv:0.06661368440836668 - timing_s/update_actor:19.735094776377082 - timing_s/update_weights:27.623224031180143 - timing_s/step:107.84064295142889 - timing_s/stop_profile:5.444232374429703e-05 - timing_per_token_ms/adv:6.81447608961821e-05 - timing_per_token_ms/update_actor:0.020188694361286466 - timing_per_token_ms/gen:0.03905771325263756 - timing_per_token_ms/ref:0.020435319161380944 - perf/total_num_tokens:1529927 - perf/time_per_step:107.84064295142889 - perf/throughput:3546.7309868716998 - frontier/active_count:42.0 - frontier/completed_count:22.0 - frontier/blacklisted_count:1342.0 - frontier/mean_score:2.6081179829945205 - frontier/mean_frontier_pct:0.5560727839089251 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:9.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.511756627738999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:2.2817089999999998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.059648372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.6871090890999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.1282076675378994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:96.0 - frontier/cluster_8/score:1.9517524750999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.06207 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:2.3336992759694564 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.096094438490056 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:144.0 - frontier/cluster_15/score:2.591766695870656 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:1.9411207287592998 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:2.9932177692715096 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.3607762490999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.2829962999999998 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:96.0 - frontier/cluster_23/score:2.9742947523409997 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:96.0 - frontier/cluster_26/score:2.7784348316299994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:2.031769906798999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.3861587127989994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:1.5789394099999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:2.9315101287592995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.741740310899999 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.0110037 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:3.7990177692715092 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:2.1984155869999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:2.885132892959299 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.9987575204247934 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:2.9717524750999993 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.8168036999999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:3.051762589999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.1789394099999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:2.9717524750999993 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.5371636690999995 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.3463567327989998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:128.0 - frontier/cluster_49/score:3.801052438490056 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.5799599883699997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:2.8635389248999994 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.6878699999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.7866888176299995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.9631922644470894 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.1483662178589995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:144.0 - frontier/cluster_59/score:3.9109831000529622 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:96.0 - frontier/cluster_63/score:2.3276722691 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:134.0 - cluster/prob_snapshot/cluster_0:0.03205884610534523 - cluster/prob_snapshot/cluster_1:0.020829734358692502 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.018802541639579454 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - 
cluster/prob_snapshot/cluster_6:0.015401628411023788 - cluster/prob_snapshot/cluster_7:0.019428419826957345 - cluster/prob_snapshot/cluster_8:0.01781755937775308 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.018824648686151062 - cluster/prob_snapshot/cluster_12:0.02130435388190028 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.028264263630100553 - cluster/prob_snapshot/cluster_15:0.023660252816942017 - cluster/prob_snapshot/cluster_16:0.017720502105311335 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.02732510193527908 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.021551539722665547 - cluster/prob_snapshot/cluster_21:0.020841486127669152 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.027152353606755348 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.02536434728345789 - cluster/prob_snapshot/cluster_27:0.018548038964043435 - cluster/prob_snapshot/cluster_28:0.021783256377251786 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.014414146799075019 - cluster/prob_snapshot/cluster_31:0.026761772536231687 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.025029362796292597 - cluster/prob_snapshot/cluster_34:0.01835846414479136 - cluster/prob_snapshot/cluster_35:0.03468125469018096 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.02006934832058735 - cluster/prob_snapshot/cluster_38:0.026338394488459402 - cluster/prob_snapshot/cluster_39:0.02737567435487166 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02712914514345166 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.02571461689969324 - cluster/prob_snapshot/cluster_44:0.02785955793464259 - 
cluster/prob_snapshot/cluster_45:0.01989155019066242 - cluster/prob_snapshot/cluster_46:0.02712914514345166 - cluster/prob_snapshot/cluster_47:0.032290787129546805 - cluster/prob_snapshot/cluster_48:0.012290898224199629 - cluster/prob_snapshot/cluster_49:0.03469982919697834 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.014423463664783719 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.026141263031983008 - cluster/prob_snapshot/cluster_54:0.02453758042357672 - cluster/prob_snapshot/cluster_55:0.034568703620954426 - cluster/prob_snapshot/cluster_56:0.02705099893201341 - cluster/prob_snapshot/cluster_57:0.019612447346787812 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.03570338682778519 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.021249333302120477
[36m(TaskRunner pid=2823680)[0m Training Progress:  17%|█▋        | 135/800 [4:10:37<19:53:59, 107.73s/it]
[36m(TaskRunner pid=2823680)[0m step:135 - global_seqlen/min:284666 - global_seqlen/max:381675 - global_seqlen/minmax_diff:97009 - global_seqlen/balanced_min:330488 - global_seqlen/balanced_max:330708 - global_seqlen/mean:330592.5 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.2576034293719092) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011490179225802422 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.00020586742903105915) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005945515443885114) - actor/ppo_kl:np.float64(-7.039498260679948e-06) - actor/pg_clipfrac_lower:np.float64(3.416636659216126e-06) - actor/grad_norm:np.float64(0.24010216616667235) - perf/mfu/actor:np.float64(0.18605640112926086) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.81441879272461) - actor/lr:np.float64(1e-06) - training/global_step:135 - training/epoch:0 - critic/score/mean:0.6173469424247742 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6084623336791992 - critic/rewards/max:1.004891037940979 - critic/rewards/min:-0.060945991426706314 - critic/advantages/mean:-0.15303978323936462 - critic/advantages/max:2.4747588634490967 - critic/advantages/min:-2.4748435020446777 - critic/returns/mean:-0.15303978323936462 - critic/returns/max:2.4747588634490967 - critic/returns/min:-2.4748435020446777 - response_length/mean:1029.5421142578125 - response_length/max:8192.0 - response_length/min:3.0 - response_length/clip_ratio:0.005102040711790323 - response_length_non_aborted/mean:1029.5421142578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:3.0 - response_length_non_aborted/clip_ratio:0.005102040711790323 - response/aborted_ratio:0.0 - prompt_length/mean:234.16326904296875 - prompt_length/max:406.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.261948823928833e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.11466011591255665) - timing_s/agent_loop/generate_sequences/max:np.float64(26.12294291611761) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.630402158088145) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(26.12294291611761) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:308 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:28.099842176772654 - timing_s/reward:0.0001265285536646843 - timing_s/old_log_prob:9.625113948248327 - timing_s/ref:19.55317780189216 - timing_s/adv:0.13394083734601736 - timing_s/update_actor:20.717871516942978 - timing_s/update_weights:25.933465185575187 - timing_s/step:104.46758264862001 - timing_s/stop_profile:5.224160850048065e-05 - timing_per_token_ms/adv:0.00013519203967319275 - timing_per_token_ms/update_actor:0.0209114065848861 - timing_per_token_ms/gen:0.03481318123245877 - timing_per_token_ms/ref:0.019735832935712176 - perf/total_num_tokens:1322370 - perf/time_per_step:104.46758264862001 - perf/throughput:3164.546279509101 - frontier/active_count:42.0 - frontier/completed_count:22.0 - frontier/blacklisted_count:1372.0 - frontier/mean_score:2.614304838548066 - frontier/mean_frontier_pct:0.5713344112795622 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:9.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.3582296394172992 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:2.4971962999999997 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.059648372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.6871090890999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.1282076675378994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:1.6662267325699995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.06207 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:2.5335894931786194 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.096094438490056 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:144.0 - frontier/cluster_15/score:2.591766695870656 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:1.9411207287592998 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:2.9952524384900565 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.3607762490999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.4980974099999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.9820063266387 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:96.0 - frontier/cluster_26/score:2.7784348316299994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:2.031769906798999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.5703110989592997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:1.5789394099999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:2.9520570901315093 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - 
frontier/cluster_33/score:2.819218217629999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.0110037 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:144.0 - frontier/cluster_35/score:4.159312438490057 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.8388909108999996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:2.885132892959299 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.9987575204247934 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:2.9802267325699994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.8168036999999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:3.051762589999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.1789394099999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:2.9717524750999993 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.3760145683699996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.3463567327989998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:128.0 - frontier/cluster_49/score:3.801052438490056 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.5799599883699997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:2.8635389248999994 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.6878699999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.7866888176299995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.9631922644470894 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.1483662178589995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:4.237688170037073 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:112.0 - frontier/cluster_63/score:1.9293705883699999 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:135.0 - cluster/prob_snapshot/cluster_0:0.03058474565726688 - cluster/prob_snapshot/cluster_1:0.02274296932982232 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.018758044682893944 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - 
cluster/prob_snapshot/cluster_6:0.015365179849684132 - cluster/prob_snapshot/cluster_7:0.01938244170484601 - cluster/prob_snapshot/cluster_8:0.015174995844487501 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.018780099412267555 - cluster/prob_snapshot/cluster_12:0.02307441675198759 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.028197375134991518 - cluster/prob_snapshot/cluster_15:0.02360425989508365 - cluster/prob_snapshot/cluster_16:0.01767856583787791 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.027278966552871667 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.021500537153548674 - cluster/prob_snapshot/cluster_21:0.022751176100388494 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.027158328893919986 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.025304321554805734 - cluster/prob_snapshot/cluster_27:0.01850414429798154 - cluster/prob_snapshot/cluster_28:0.02340885515957767 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.014380035151933288 - cluster/prob_snapshot/cluster_31:0.026885569172413658 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.025675752225660205 - cluster/prob_snapshot/cluster_34:0.018315018115019316 - cluster/prob_snapshot/cluster_35:0.03788052834361794 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.016747517841303756 - cluster/prob_snapshot/cluster_38:0.026276063638663444 - cluster/prob_snapshot/cluster_39:0.02731088874134369 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02714212141622831 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.02565376224417361 - cluster/prob_snapshot/cluster_44:0.027793627191530407 - 
cluster/prob_snapshot/cluster_45:0.01984447605227156 - cluster/prob_snapshot/cluster_46:0.027064942951029868 - cluster/prob_snapshot/cluster_47:0.03074672014589812 - cluster/prob_snapshot/cluster_48:0.012261811328587762 - cluster/prob_snapshot/cluster_49:0.03461771068202596 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.01438932996891167 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.026079398701557052 - cluster/prob_snapshot/cluster_54:0.024479511271320368 - cluster/prob_snapshot/cluster_55:0.03448689541985155 - cluster/prob_snapshot/cluster_56:0.02698698167568441 - cluster/prob_snapshot/cluster_57:0.01956603371628959 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.038594327598717025 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.017571552591664565
[36m(TaskRunner pid=2823680)[0m Training Progress:  17%|█▋        | 136/800 [4:12:31<20:12:32, 109.57s/it]
[36m(TaskRunner pid=2823680)[0m step:136 - global_seqlen/min:333268 - global_seqlen/max:458142 - global_seqlen/minmax_diff:124874 - global_seqlen/balanced_min:386361 - global_seqlen/balanced_max:386422 - global_seqlen/mean:386386.0 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.25418722998113075) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010934760794043541 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.018103036840329878) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00046689410060315957) - actor/ppo_kl:np.float64(8.534475443382767e-05) - actor/pg_clipfrac_lower:np.float64(5.845861178773277e-07) - actor/grad_norm:np.float64(0.22210041185220084) - perf/mfu/actor:np.float64(0.210418299915949) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.69354438781738) - actor/lr:np.float64(1e-06) - training/global_step:136 - training/epoch:0 - critic/score/mean:0.602393627166748 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5929773449897766 - critic/rewards/max:1.002040147781372 - critic/rewards/min:-0.05193846672773361 - critic/advantages/mean:-0.13695643842220306 - critic/advantages/max:2.4748456478118896 - critic/advantages/min:-2.4748430252075195 - critic/returns/mean:-0.13695643842220306 - critic/returns/max:2.4748456478118896 - critic/returns/min:-2.4748430252075195 - response_length/mean:1187.2965087890625 - response_length/max:8192.0 - response_length/min:154.0 - response_length/clip_ratio:0.010638297535479069 - response_length_non_aborted/mean:1187.2965087890625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:154.0 - response_length_non_aborted/clip_ratio:0.010638297535479069 - response/aborted_ratio:0.0 - prompt_length/mean:242.6914825439453 - prompt_length/max:1168.0 - prompt_length/min:181.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.332636207342148e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.8145820870995522) - timing_s/agent_loop/generate_sequences/max:np.float64(29.252346655353904) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.341097505927792) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.252346655353904) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:198 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.111211919225752 - timing_s/reward:0.00021585915237665176 - timing_s/old_log_prob:10.35073725041002 - timing_s/ref:21.58578715659678 - timing_s/adv:0.07344134524464607 - timing_s/update_actor:21.52960404381156 - timing_s/update_weights:28.56929265987128 - timing_s/step:113.59309023153037 - timing_s/stop_profile:6.21965155005455e-05 - timing_per_token_ms/adv:6.829523127299465e-05 - timing_per_token_ms/update_actor:0.020021001555595858 - timing_per_token_ms/gen:0.034844953188201064 - timing_per_token_ms/ref:0.02007324785730127 - perf/total_num_tokens:1545544 - perf/time_per_step:113.59309023153037 - perf/throughput:3401.492108476416 - frontier/active_count:40.0 - frontier/completed_count:24.0 - frontier/blacklisted_count:1406.0 - frontier/mean_score:2.628949259330562 - frontier/mean_frontier_pct:0.5772184673703429 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:10.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.3582296394172992 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:2.4971962999999997 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.059648372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.6871090890999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.4663587127989997 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.06207 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:2.5335894931786194 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.096094438490056 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:144.0 - frontier/cluster_15/score:2.591766695870656 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:1.9411207287592998 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:2.9952524384900565 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.3607762490999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.4980974099999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.9874044286470895 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:96.0 - frontier/cluster_26/score:2.7784348316299994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:1.7222389347592995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.5703110989592997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:112.0 - frontier/cluster_30/score:1.4052575869999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:2.9520570901315093 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:2.873452752340999 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.0110037 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:144.0 - frontier/cluster_35/score:3.811518706943039 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.8388909108999996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:2.919593025071509 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.9987575204247934 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:2.9802267325699994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.8717625899999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.436233812999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.1789394099999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:2.9802267325699994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.3760145683699996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:128.0 - frontier/cluster_49/score:3.560736706943039 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - 
frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:1.5799599883699997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:2.8635389248999994 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.6878699999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.7866888176299995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:128.0 - frontier/cluster_56/score:3.5742345851129627 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.1483662178589995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:3.866381719025951 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:112.0 - frontier/cluster_63/score:2.2505594118589998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:136.0 - cluster/prob_snapshot/cluster_0:0.031935093721364195 - cluster/prob_snapshot/cluster_1:0.02374709488151065 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.019586231701600718 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.01604356838680111 - 
cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.013944342094038692 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.019609260170022142 - cluster/prob_snapshot/cluster_12:0.02409317604920012 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.029442318328335184 - cluster/prob_snapshot/cluster_15:0.02464641231351329 - cluster/prob_snapshot/cluster_16:0.01845909275226549 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.02848336106012151 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.02244980804328219 - cluster/prob_snapshot/cluster_21:0.023755663989461304 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.028408730389568315 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.02642153344885689 - cluster/prob_snapshot/cluster_27:0.016377635747882903 - cluster/prob_snapshot/cluster_28:0.024442380257405635 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013363300775133978 - cluster/prob_snapshot/cluster_31:0.02807259477958909 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.027325106619523518 - cluster/prob_snapshot/cluster_34:0.019123645053842574 - cluster/prob_snapshot/cluster_35:0.03624564731912711 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0174869380264138 - cluster/prob_snapshot/cluster_38:0.02776387766623306 - cluster/prob_snapshot/cluster_39:0.028516692646137265 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02834047406956123 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.027309034016229625 - cluster/prob_snapshot/cluster_44:0.023167371948635135 - cluster/prob_snapshot/cluster_45:0.020720630136418523 - cluster/prob_snapshot/cluster_46:0.02834047406956123 - 
cluster/prob_snapshot/cluster_47:0.032104219550719584 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.03386083522062488 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.015024633727357697 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0272308310510068 - cluster/prob_snapshot/cluster_54:0.025560306940694257 - cluster/prob_snapshot/cluster_55:0.03600952742041744 - cluster/prob_snapshot/cluster_56:0.03398919332911649 - cluster/prob_snapshot/cluster_57:0.020429894284133707 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.03676736727899505 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.02140170073529761
[36m(TaskRunner pid=2823680)[0m Training Progress:  17%|█▋        | 137/800 [4:14:27<20:33:08, 111.60s/it]
[36m(TaskRunner pid=2823680)[0m step:137 - global_seqlen/min:397916 - global_seqlen/max:452661 - global_seqlen/minmax_diff:54745 - global_seqlen/balanced_min:423081 - global_seqlen/balanced_max:423166 - global_seqlen/mean:423108.25 - frontier/skipped_zero_acc_count:41.0 - actor/entropy:np.float64(0.2767260404388336) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010541055351495743 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04498653846530942) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000328290779386655) - actor/ppo_kl:np.float64(-3.9223309998574635e-05) - actor/pg_clipfrac_lower:np.float64(3.0366635325084315e-06) - actor/grad_norm:np.float64(0.23707115243781696) - perf/mfu/actor:np.float64(0.24920744080002463) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.10360336303711) - actor/lr:np.float64(1e-06) - training/global_step:137 - training/epoch:0 - critic/score/mean:0.5517241358757019 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5422219038009644 - critic/rewards/max:1.0078102350234985 - critic/rewards/min:-0.06383417546749115 - critic/advantages/mean:-0.1343315690755844 - critic/advantages/max:2.4748551845550537 - critic/advantages/min:-2.4748592376708984 - critic/returns/mean:-0.1343315690755844 - critic/returns/max:2.4748551845550537 - critic/returns/min:-2.4748592376708984 - response_length/mean:1289.0804443359375 - response_length/max:8192.0 - response_length/min:143.0 - response_length/clip_ratio:0.015804598107933998 - response_length_non_aborted/mean:1289.0804443359375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:143.0 - response_length_non_aborted/clip_ratio:0.015804598107933998 - response/aborted_ratio:0.0 - prompt_length/mean:237.42529296875 - prompt_length/max:528.0 - prompt_length/min:181.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.200535714626312e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.639779495075345) - timing_s/agent_loop/generate_sequences/max:np.float64(31.18705634959042) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.076445152895758) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(31.18705634959042) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:196 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.897129675373435 - timing_s/reward:0.0001221131533384323 - timing_s/old_log_prob:9.704697275534272 - timing_s/ref:22.99003418534994 - timing_s/adv:0.0868408726528287 - timing_s/update_actor:20.047700294293463 - timing_s/update_weights:29.914539579302073 - timing_s/step:116.0286606317386 - timing_s/stop_profile:5.409028381109238e-05 - timing_per_token_ms/adv:8.173658631088646e-05 - timing_per_token_ms/update_actor:0.01886934729444967 - timing_per_token_ms/gen:0.03666643967384467 - timing_per_token_ms/ref:0.021638738258578247 - perf/total_num_tokens:1692433 - perf/time_per_step:116.0286606317386 - perf/throughput:3646.5839362129336 - frontier/active_count:38.0 - frontier/completed_count:26.0 - frontier/blacklisted_count:1446.0 - frontier/mean_score:2.533974991115141 - frontier/mean_frontier_pct:0.5838538557631304 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:10.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.3582296394172992 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.0480374099999996 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.341753861099999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.6871090890999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.4663587127989997 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.06207 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:2.5335894931786194 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.096094438490056 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.114236687109459 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:1.9411207287592998 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:128.0 - frontier/cluster_18/score:2.9966767069430396 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.3607762490999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.0486681869999996 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.9874044286470895 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:112.0 - frontier/cluster_26/score:2.2449043821409993 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:1.7222389347592995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.6992177692715096 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:112.0 - frontier/cluster_30/score:1.4052575869999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:2.9520570901315093 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:2.873452752340999 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.0110037 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:144.0 - frontier/cluster_35/score:3.811518706943039 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.8388909108999996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:2.919593025071509 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:176.0 - frontier/cluster_39/score:2.999130264297355 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:2.9861587127989995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.8717625899999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.436233812999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.1789394099999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:2.9861587127989995 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:3.2632101978589994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:128.0 - frontier/cluster_49/score:3.560736706943039 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - 
frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:2.0059719918589995 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:112.0 - frontier/cluster_53/score:2.9044772474299996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.6878699999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.1483662178589995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:3.866381719025951 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:1.8753915883012997 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:137.0 - cluster/prob_snapshot/cluster_0:0.03487582336252681 - cluster/prob_snapshot/cluster_1:0.02126923963526225 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.024319538205377078 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.01752093361756361 - cluster/prob_snapshot/cluster_7:0.0 - 
cluster/prob_snapshot/cluster_8:0.015228400956687907 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.02141497062530964 - cluster/prob_snapshot/cluster_12:0.02631178600775593 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.032153501798410866 - cluster/prob_snapshot/cluster_15:0.021956731124259134 - cluster/prob_snapshot/cluster_16:0.020158890525811463 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.031121030640441312 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.02451708915187417 - cluster/prob_snapshot/cluster_21:0.021275790368712678 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.031024736350074127 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0233137388159366 - cluster/prob_snapshot/cluster_27:0.01788576343074433 - cluster/prob_snapshot/cluster_28:0.028031865669091344 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.014593854692904945 - cluster/prob_snapshot/cluster_31:0.03065764783416825 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.029841327542031858 - cluster/prob_snapshot/cluster_34:0.020884637845897083 - cluster/prob_snapshot/cluster_35:0.039583312470965276 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.01909721534190031 - cluster/prob_snapshot/cluster_38:0.030320502635587193 - cluster/prob_snapshot/cluster_39:0.031146511278184057 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.031011799365250947 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.029823774899840026 - cluster/prob_snapshot/cluster_44:0.025300729626919104 - cluster/prob_snapshot/cluster_45:0.022628680626496436 - cluster/prob_snapshot/cluster_46:0.031011799365250947 - 
cluster/prob_snapshot/cluster_47:0.03388902924312042 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.036978895955834166 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.020832382645038405 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0301635225455943 - cluster/prob_snapshot/cluster_54:0.02791401702883559 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.02231117248583125 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.04015307479337507 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.019476281491135183
(TaskRunner pid=2823680) Training Progress:  17%|█▋        | 138/800 [4:16:23<20:46:02, 112.93s/it]
[36m(TaskRunner pid=2823680)[0m step:138 - global_seqlen/min:327762 - global_seqlen/max:413926 - global_seqlen/minmax_diff:86164 - global_seqlen/balanced_min:368659 - global_seqlen/balanced_max:368753 - global_seqlen/mean:368690.75 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.23803945049187358) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011715376749634743 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.029154998541343957) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00043055803639525003) - actor/ppo_kl:np.float64(4.216631316351094e-05) - actor/pg_clipfrac_lower:np.float64(1.2831322757327662e-06) - actor/grad_norm:np.float64(0.228707580612256) - perf/mfu/actor:np.float64(0.19402318764185014) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(107.39159393310547) - actor/lr:np.float64(1e-06) - training/global_step:138 - training/epoch:0 - critic/score/mean:0.6275510191917419 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6186521649360657 - critic/rewards/max:1.0013821125030518 - critic/rewards/min:-0.05792256444692612 - critic/advantages/mean:-0.16830836236476898 - critic/advantages/max:2.47480845451355 - critic/advantages/min:-2.474858283996582 - critic/returns/mean:-0.16830836236476898 - critic/returns/max:2.47480845451355 - critic/returns/min:-2.474858283996582 - response_length/mean:1135.228271484375 - response_length/max:8192.0 - response_length/min:120.0 - response_length/clip_ratio:0.012755102477967739 - response_length_non_aborted/mean:1135.228271484375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:120.0 - response_length_non_aborted/clip_ratio:0.012755102477967739 - response/aborted_ratio:0.0 - prompt_length/mean:242.11224365234375 - prompt_length/max:886.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.800067007541656e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9671858558431268) - timing_s/agent_loop/generate_sequences/max:np.float64(30.239791645668447) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.591263804481059) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.239791645668447) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:190 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.94147031661123 - timing_s/reward:0.0001286529004573822 - timing_s/old_log_prob:9.944185223430395 - timing_s/ref:20.861767498776317 - timing_s/adv:0.10136693902313709 - timing_s/update_actor:22.641013053245842 - timing_s/update_weights:29.834191222675145 - timing_s/step:115.7782476618886 - timing_s/stop_profile:5.898158997297287e-05 - timing_per_token_ms/adv:9.387261852332727e-05 - timing_per_token_ms/update_actor:0.020967104282826398 - timing_per_token_ms/gen:0.03588852633102353 - timing_per_token_ms/ref:0.01931940296320856 - perf/total_num_tokens:1474763 - perf/time_per_step:115.7782476618886 - perf/throughput:3184.456125788852 - frontier/active_count:35.0 - frontier/completed_count:29.0 - frontier/blacklisted_count:1476.0 - frontier/mean_score:2.5436559532181007 - frontier/mean_frontier_pct:0.5902927127825622 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:11.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.8507607475921093 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.0480374099999996 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.341753861099999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.6871090890999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.4663587127989997 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.06207 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:192.0 - frontier/cluster_12/score:2.673512645225033 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.096094438490056 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:1.9411207287592998 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:128.0 - frontier/cluster_18/score:2.9976736948601275 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:96.0 - frontier/cluster_20/score:2.55254337437 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:1.7340677308999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.9911831000529623 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:112.0 - frontier/cluster_26/score:2.2449043821409993 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:1.7222389347592995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.6992177692715096 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:112.0 - frontier/cluster_33/score:2.911416926638699 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.0110037 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:3.5680630948601273 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.8388909108999996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:128.0 - frontier/cluster_38/score:2.9437151175500564 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:176.0 - frontier/cluster_39/score:2.999130264297355 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.3903110989592995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9102338129999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.436233812999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.1789394099999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:2.9903110989592996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:3.2632101978589994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:128.0 - frontier/cluster_49/score:3.560736706943039 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:2.0059719918589995 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:2.3331340732009993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.6878699999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.1483662178589995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:3.866381719025951 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:1.8753915883012997 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:138.0 - cluster/prob_snapshot/cluster_0:0.04325338712033563 - cluster/prob_snapshot/cluster_1:0.02300442970575403 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02630357815868853 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.018950328864422286 - cluster/prob_snapshot/cluster_7:0.0 - 
cluster/prob_snapshot/cluster_8:0.016470766484682795 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.023162049741730167 - cluster/prob_snapshot/cluster_12:0.03003003432174796 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.03477665326075294 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.021803495940596787 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.03367114949047623 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.028671255876416276 - cluster/prob_snapshot/cluster_21:0.019477788357638175 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.03359824436127258 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.025215723503362084 - cluster/prob_snapshot/cluster_27:0.01934492227423864 - cluster/prob_snapshot/cluster_28:0.03031876523863334 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.03270227668009365 - cluster/prob_snapshot/cluster_34:0.022588451279638135 - cluster/prob_snapshot/cluster_35:0.0400780065103425 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.020655207023952286 - cluster/prob_snapshot/cluster_38:0.03306506373604071 - cluster/prob_snapshot/cluster_39:0.03368751029964244 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.026848993764666085 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.03268898744308925 - cluster/prob_snapshot/cluster_44:0.027364817275451817 - cluster/prob_snapshot/cluster_45:0.02447477680129005 - cluster/prob_snapshot/cluster_46:0.033588449673067886 - cluster/prob_snapshot/cluster_47:0.036653768747196405 - cluster/prob_snapshot/cluster_48:0.0 - 
cluster/prob_snapshot/cluster_49:0.03999571339645073 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.022531932987704437 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.026206757024546723 - cluster/prob_snapshot/cluster_54:0.03019130225419324 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.024131365667267774 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.04342884853404359 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.021065198200573718
(TaskRunner pid=2823680) Training Progress:  17%|█▋        | 139/800 [4:18:13<20:33:53, 112.00s/it]
[36m(TaskRunner pid=2823680)[0m step:139 - global_seqlen/min:308585 - global_seqlen/max:374364 - global_seqlen/minmax_diff:65779 - global_seqlen/balanced_min:346927 - global_seqlen/balanced_max:347020 - global_seqlen/mean:346984.0 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.2589388601869966) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012645148672163486 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04972134054150956) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006996780185014965) - actor/ppo_kl:np.float64(5.32881709460753e-05) - actor/pg_clipfrac_lower:np.float64(5.13098761227108e-06) - actor/grad_norm:np.float64(0.2400951844950517) - perf/mfu/actor:np.float64(0.19321904337550283) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(108.94087982177734) - actor/lr:np.float64(1e-06) - training/global_step:139 - training/epoch:0 - critic/score/mean:0.62890625 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6191586852073669 - critic/rewards/max:1.0089031457901 - critic/rewards/min:-0.05825801193714142 - critic/advantages/mean:-0.10808020085096359 - critic/advantages/max:2.4748504161834717 - critic/advantages/min:-2.4748523235321045 - critic/returns/mean:-0.10808020085096359 - critic/returns/max:2.4748504161834717 - critic/returns/min:-2.4748523235321045 - response_length/mean:1045.0482177734375 - response_length/max:8192.0 - response_length/min:97.0 - response_length/clip_ratio:0.0052083334885537624 - response_length_non_aborted/mean:1045.0482177734375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:97.0 - response_length_non_aborted/clip_ratio:0.0052083334885537624 - response/aborted_ratio:0.0 - prompt_length/mean:232.8541717529297 - prompt_length/max:410.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.87853854894638e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.8608773499727249) - timing_s/agent_loop/generate_sequences/max:np.float64(28.181467530317605) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.199158858217743) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.181467530317605) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:202 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.834526175633073 - timing_s/reward:0.00017067790031433105 - timing_s/old_log_prob:11.522489545866847 - timing_s/ref:19.448144171386957 - timing_s/adv:0.08023402839899063 - timing_s/update_actor:20.948098519816995 - timing_s/update_weights:26.271391549147666 - timing_s/step:109.5583567135036 - timing_s/stop_profile:5.1023438572883606e-05 - timing_per_token_ms/adv:8.175224942302564e-05 - timing_per_token_ms/update_actor:0.02134448698766492 - timing_per_token_ms/gen:0.038418441852677086 - timing_per_token_ms/ref:0.01981614989101296 - perf/total_num_tokens:1387936 - perf/time_per_step:109.5583567135036 - perf/throughput:3167.115776547902 - frontier/active_count:35.0 - frontier/completed_count:29.0 - frontier/blacklisted_count:1508.0 - frontier/mean_score:2.591807799890396 - frontier/mean_frontier_pct:0.6103811711288595 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:11.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.8507607475921093 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:1.7336261869999996 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:112.0 - frontier/cluster_3/score:2.539227702769999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.6871090890999996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.4663587127989997 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.06207 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:192.0 - frontier/cluster_12/score:2.771458851657523 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.096094438490056 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:144.0 - frontier/cluster_16/score:2.25878451013151 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - 
frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:144.0 - frontier/cluster_18/score:3.5983715864020893 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:96.0 - frontier/cluster_20/score:2.6867803620589994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:1.7340677308999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.9938281700370735 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:112.0 - frontier/cluster_26/score:2.2449043821409993 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:1.7222389347592995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.7894524384900565 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:112.0 - frontier/cluster_33/score:2.911416926638699 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - 
frontier/cluster_34/score:2.0110037 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:3.5680630948601273 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:2.18722363763 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:128.0 - frontier/cluster_38/score:2.9437151175500564 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:176.0 - frontier/cluster_39/score:2.9993911850081485 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.3903110989592995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9102338129999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.436233812999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:1.8252575869999996 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:2.9932177692715096 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:3.2632101978589994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:144.0 - frontier/cluster_49/score:3.392515694860127 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.3041803943012997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:2.3331340732009993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.7815089999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.1483662178589995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:3.866381719025951 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:2.2127741118109094 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:139.0 - cluster/prob_snapshot/cluster_0:0.04244980497787742 - cluster/prob_snapshot/cluster_1:0.019111053209085647 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02799179898268448 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.01859826057837584 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.016164764696132285 - 
cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.02273173408798955 - cluster/prob_snapshot/cluster_12:0.030551855975637927 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0341305559399338 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.02490026470797753 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.039667531195288916 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.029618381889633946 - cluster/prob_snapshot/cluster_21:0.01911592067649605 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.03300319866271035 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.024747253714855304 - cluster/prob_snapshot/cluster_27:0.018985523042831128 - cluster/prob_snapshot/cluster_28:0.030750212690573063 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.03209471812092834 - cluster/prob_snapshot/cluster_34:0.022168792212855584 - cluster/prob_snapshot/cluster_35:0.03933341810972928 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.024111395889358934 - cluster/prob_snapshot/cluster_38:0.03245076514519041 - cluster/prob_snapshot/cluster_39:0.033064523921818904 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.026350180298977666 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.03208167577773398 - cluster/prob_snapshot/cluster_44:0.026856420593522452 - cluster/prob_snapshot/cluster_45:0.02012117440715856 - cluster/prob_snapshot/cluster_46:0.03299646976025315 - cluster/prob_snapshot/cluster_47:0.03597279747581147 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.037398228316638873 - 
cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.025400697364306772 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.02571987534062038 - cluster/prob_snapshot/cluster_54:0.030662646249326995 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.023683041598001334 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.042622006585402945 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.02439305780428058
[36m(TaskRunner pid=2823680)[0m Training Progress:  18%|█▊        | 140/800 [4:20:07<20:38:12, 112.56s/it]
[36m(TaskRunner pid=2823680)[0m step:140 - global_seqlen/min:355655 - global_seqlen/max:405339 - global_seqlen/minmax_diff:49684 - global_seqlen/balanced_min:384868 - global_seqlen/balanced_max:385050 - global_seqlen/mean:384945.0 - frontier/skipped_zero_acc_count:26.0 - actor/entropy:np.float64(0.2615732405261666) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011068894527852535 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04720203668694012) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003562496634663594) - actor/ppo_kl:np.float64(5.596422686623697e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.22674576823527998) - perf/mfu/actor:np.float64(0.19955146852292452) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(111.00555419921875) - actor/lr:np.float64(1e-06) - training/global_step:140 - training/epoch:0 - critic/score/mean:0.5428921580314636 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5337451100349426 - critic/rewards/max:1.015321969985962 - critic/rewards/min:-0.039853259921073914 - critic/advantages/mean:-0.1858631819486618 - critic/advantages/max:2.4748332500457764 - critic/advantages/min:-2.4748375415802 - critic/returns/mean:-0.1858631819486618 - critic/returns/max:2.4748332500457764 - critic/returns/min:-2.4748375415802 - response_length/mean:1152.7021484375 - response_length/max:8192.0 - response_length/min:186.0 - response_length/clip_ratio:0.011029412038624287 - response_length_non_aborted/mean:1152.7021484375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:186.0 - response_length_non_aborted/clip_ratio:0.011029412038624287 - response/aborted_ratio:0.0 - prompt_length/mean:238.22549438476562 - prompt_length/max:455.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.516386151313782e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4371929476037621) - timing_s/agent_loop/generate_sequences/max:np.float64(29.198175940662622) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.835304650836406) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.198175940662622) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:207 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.1774416686967 - timing_s/reward:0.0001272093504667282 - timing_s/old_log_prob:11.227616620250046 - timing_s/ref:16.947780326008797 - timing_s/adv:0.12944225687533617 - timing_s/update_actor:23.150016299448907 - timing_s/update_weights:30.496831548400223 - timing_s/step:113.60861080512404 - timing_s/stop_profile:5.580298602581024e-05 - timing_per_token_ms/adv:0.0001140463427439334 - timing_per_token_ms/update_actor:0.020396544043243203 - timing_per_token_ms/gen:0.033146157705622126 - timing_per_token_ms/ref:0.014932004512795008 - perf/total_num_tokens:1539780 - perf/time_per_step:113.60861080512404 - perf/throughput:3388.3435179073413 - frontier/active_count:33.0 - frontier/completed_count:31.0 - frontier/blacklisted_count:1534.0 - frontier/mean_score:2.6088142584519467 - frontier/mean_frontier_pct:0.6141466711145382 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:12.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 
- frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.8507607475921093 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:1.7336261869999996 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:112.0 - frontier/cluster_3/score:2.539227702769999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:2.0809763623699995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.06207 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:192.0 - frontier/cluster_12/score:2.771458851657523 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.096094438490056 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:144.0 - frontier/cluster_18/score:3.418860110481462 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.1807462534412996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:1.7340677308999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.9938281700370735 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:112.0 - frontier/cluster_26/score:2.4714330674986993 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:1.7222389347592995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.7894524384900565 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:112.0 - frontier/cluster_33/score:2.9379918486470893 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.30770259 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:3.5680630948601273 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:2.18722363763 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:128.0 - frontier/cluster_38/score:2.960600582285039 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:192.0 - frontier/cluster_39/score:2.999573829505704 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.3903110989592995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.3371636690999993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.436233812999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.1776803108999996 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:2.9932177692715096 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:3.1842471385012994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:144.0 - frontier/cluster_49/score:3.2747609864020886 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:128.0 - 
frontier/cluster_51/score:1.9129262760109098 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:2.3331340732009993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.7815089999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.1483662178589995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:3.866381719025951 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:144.0 - frontier/cluster_63/score:1.8489418782676366 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:140.0 - cluster/prob_snapshot/cluster_0:0.04472902555095901 - cluster/prob_snapshot/cluster_1:0.020137166419031027 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02949473837550732 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.024171858753258727 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - 
cluster/prob_snapshot/cluster_11:0.023952249377097876 - cluster/prob_snapshot/cluster_12:0.032192250288916625 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.03596309828752644 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0397122260406603 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.025330749242556214 - cluster/prob_snapshot/cluster_21:0.02014229523114882 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.03477521079347867 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.028707260738740595 - cluster/prob_snapshot/cluster_27:0.020004896270399232 - cluster/prob_snapshot/cluster_28:0.03240125719896396 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.034126636548068634 - cluster/prob_snapshot/cluster_34:0.02680542751887892 - cluster/prob_snapshot/cluster_35:0.041445313224725354 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.025405988163348926 - cluster/prob_snapshot/cluster_38:0.03438925131196925 - cluster/prob_snapshot/cluster_39:0.03484195026809812 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.027764977683162075 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.027147636616301214 - cluster/prob_snapshot/cluster_44:0.028298399098912262 - cluster/prob_snapshot/cluster_45:0.025295136377655864 - cluster/prob_snapshot/cluster_46:0.03476812060189606 - cluster/prob_snapshot/cluster_47:0.03698704774313842 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.038038423427281254 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.022219850539999357 - 
cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.027100830307174088 - cluster/prob_snapshot/cluster_54:0.0323089891287115 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.024954634616424925 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.04491047303009207 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.02147663122591676
[36m(TaskRunner pid=2823680)[0m Training Progress:  18%|█▊        | 141/800 [4:21:57<20:26:33, 111.68s/it]
[36m(TaskRunner pid=2823680)[0m step:141 - global_seqlen/min:322422 - global_seqlen/max:437298 - global_seqlen/minmax_diff:114876 - global_seqlen/balanced_min:373503 - global_seqlen/balanced_max:373551 - global_seqlen/mean:373531.0 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.2367002834022666) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011181033216416836 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.013207048814365407) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00029683376010325446) - actor/ppo_kl:np.float64(2.329560267181711e-05) - actor/pg_clipfrac_lower:np.float64(3.2405923775513656e-06) - actor/grad_norm:np.float64(0.24577412630120912) - perf/mfu/actor:np.float64(0.21201983857224083) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.83564758300781) - actor/lr:np.float64(1e-06) - training/global_step:141 - training/epoch:0 - critic/score/mean:0.5789473652839661 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5695717930793762 - critic/rewards/max:1.0121549367904663 - critic/rewards/min:-0.05542229115962982 - critic/advantages/mean:-0.10732074826955795 - critic/advantages/max:2.474842071533203 - critic/advantages/min:-2.474858522415161 - critic/returns/mean:-0.10732074826955795 - critic/returns/max:2.474842071533203 - critic/returns/min:-2.474858522415161 - response_length/mean:1137.67236328125 - response_length/max:8192.0 - response_length/min:157.0 - response_length/clip_ratio:0.007894736714661121 - response_length_non_aborted/mean:1137.67236328125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:157.0 - response_length_non_aborted/clip_ratio:0.007894736714661121 - response/aborted_ratio:0.0 - prompt_length/mean:249.726318359375 - prompt_length/max:657.0 - prompt_length/min:187.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.387677371501923e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2236659163609147) - timing_s/agent_loop/generate_sequences/max:np.float64(27.941805575974286) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.664236251888724) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.941805575974286) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:275 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.50991841685027 - timing_s/reward:0.00014119595289230347 - timing_s/old_log_prob:9.73999404720962 - timing_s/ref:20.827991742640734 - timing_s/adv:0.07066246494650841 - timing_s/update_actor:20.68489324580878 - timing_s/update_weights:28.09981241170317 - timing_s/step:109.38359926640987 - timing_s/stop_profile:5.1584094762802124e-05 - timing_per_token_ms/adv:6.701529172496086e-05 - timing_per_token_ms/update_actor:0.019617262944576115 - timing_per_token_ms/gen:0.0341300721543066 - timing_per_token_ms/ref:0.019752975554062017 - perf/total_num_tokens:1494124 - perf/time_per_step:109.38359926640987 - perf/throughput:3414.872087818617 - frontier/active_count:33.0 - frontier/completed_count:31.0 - frontier/blacklisted_count:1567.0 - frontier/mean_score:2.6118947567627675 - frontier/mean_frontier_pct:0.6413064144572321 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:12.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.8507607475921093 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:1.7336261869999996 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:112.0 - frontier/cluster_3/score:2.539227702769999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:2.0809763623699995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.06207 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:208.0 - frontier/cluster_12/score:2.840021196160266 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:128.0 - frontier/cluster_14/score:3.067266106943039 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - 
frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:160.0 - frontier/cluster_18/score:3.8932020773370235 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.4265223774089097 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:1.7340677308999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:144.0 - frontier/cluster_23/score:2.995679719025951 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:112.0 - frontier/cluster_26/score:2.4714330674986993 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:1.7222389347592995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.7894524384900565 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:112.0 - frontier/cluster_33/score:2.9379918486470893 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - 
frontier/cluster_34/score:2.30770259 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:3.3976441664020887 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:2.18722363763 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:144.0 - frontier/cluster_38/score:2.972420407599527 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:192.0 - frontier/cluster_39/score:2.9997016806539927 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.3903110989592995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:1.9360145683699994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.605363669099999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:112.0 - frontier/cluster_45/score:2.4243762176299994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:2.9952524384900565 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:3.1289729969509095 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:144.0 - frontier/cluster_49/score:3.2747609864020886 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:144.0 - frontier/cluster_51/score:1.6390483932076367 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:2.3331340732009993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:2.8470562999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.1483662178589995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:3.866381719025951 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:160.0 - frontier/cluster_63/score:1.5942593147873456 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:141.0 - cluster/prob_snapshot/cluster_0:0.04467627163072634 - cluster/prob_snapshot/cluster_1:0.020113416416479075 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0294599519464184 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.024143350188789957 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - 
cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.023923999822419047 - cluster/prob_snapshot/cluster_12:0.03294973817213021 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.03558621860451742 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.045168673132751275 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.028152352212207445 - cluster/prob_snapshot/cluster_21:0.020118539179617666 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.034755678064275805 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.028673403069711237 - cluster/prob_snapshot/cluster_27:0.019981302268761318 - cluster/prob_snapshot/cluster_28:0.03236304278859714 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.034086387205721234 - cluster/prob_snapshot/cluster_34:0.026773812893527368 - cluster/prob_snapshot/cluster_35:0.03941924301000776 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.025376024052651398 - cluster/prob_snapshot/cluster_38:0.034485825070713336 - cluster/prob_snapshot/cluster_39:0.0348023405972039 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.027732231353459664 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.02246151303781373 - cluster/prob_snapshot/cluster_44:0.03022725705571849 - cluster/prob_snapshot/cluster_45:0.028127452608329044 - cluster/prob_snapshot/cluster_46:0.03475072078374465 - cluster/prob_snapshot/cluster_47:0.036302137863121794 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.037993560479105175 - cluster/prob_snapshot/cluster_50:0.0 - 
cluster/prob_snapshot/cluster_51:0.01901613110516895 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.027068867280422362 - cluster/prob_snapshot/cluster_54:0.0330313589818081 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.02492520283722172 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.044857505108643794 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.018496491178215276
[36m(TaskRunner pid=2823680)[0m Training Progress:  18%|█▊        | 142/800 [4:23:45<20:15:07, 110.80s/it]
[36m(TaskRunner pid=2823680)[0m step:142 - global_seqlen/min:328726 - global_seqlen/max:446529 - global_seqlen/minmax_diff:117803 - global_seqlen/balanced_min:378552 - global_seqlen/balanced_max:378691 - global_seqlen/mean:378652.25 - frontier/skipped_zero_acc_count:35.0 - actor/entropy:np.float64(0.20007825345593563) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012239211238920689 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.002234211700852029) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005827422963270789) - actor/ppo_kl:np.float64(4.345609849504863e-05) - actor/pg_clipfrac_lower:np.float64(4.3364982164038863e-07) - actor/grad_norm:np.float64(0.21810605997840563) - perf/mfu/actor:np.float64(0.20453540749635563) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.76338195800781) - actor/lr:np.float64(1e-06) - training/global_step:142 - training/epoch:0 - critic/score/mean:0.6088709831237793 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5988412499427795 - critic/rewards/max:1.0000890493392944 - critic/rewards/min:-0.04445302113890648 - critic/advantages/mean:-0.14955778419971466 - critic/advantages/max:2.4746146202087402 - critic/advantages/min:-2.474858283996582 - critic/returns/mean:-0.14955778419971466 - critic/returns/max:2.4746146202087402 - critic/returns/min:-2.474858283996582 - response_length/mean:1075.47314453125 - response_length/max:8192.0 - response_length/min:180.0 - response_length/clip_ratio:0.006720430217683315 - response_length_non_aborted/mean:1075.47314453125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:180.0 - response_length_non_aborted/clip_ratio:0.006720430217683315 - response/aborted_ratio:0.0 - prompt_length/mean:241.65591430664062 - prompt_length/max:657.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010622944682836533 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.669017044827342) - timing_s/agent_loop/generate_sequences/max:np.float64(28.758560822345316) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.103776474150436) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.758560822345316) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:186 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.528341074474156 - timing_s/reward:0.00025888439267873764 - timing_s/old_log_prob:9.193528975360096 - timing_s/ref:20.062054730020463 - timing_s/adv:0.10339648183435202 - timing_s/update_actor:21.65462251100689 - timing_s/update_weights:26.597836885601282 - timing_s/step:108.54620327614248 - timing_s/stop_profile:6.185285747051239e-05 - timing_per_token_ms/adv:0.0001055126434105949 - timing_per_token_ms/update_actor:0.022097816315020953 - timing_per_token_ms/gen:0.038153177239417205 - timing_per_token_ms/ref:0.020472654284347334 - perf/total_num_tokens:1514609 - perf/time_per_step:108.54620327614248 - perf/throughput:3488.3970011987008 - frontier/active_count:32.0 - frontier/completed_count:32.0 - frontier/blacklisted_count:1602.0 - frontier/mean_score:2.495447726400394 - frontier/mean_frontier_pct:0.6557631258481053 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:12.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.595532523314476 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:1.7336261869999996 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:112.0 - frontier/cluster_3/score:2.6774593919389993 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:2.0809763623699995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.06207 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:224.0 - frontier/cluster_12/score:2.288014837312186 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:128.0 - frontier/cluster_14/score:3.047086274860127 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - 
frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:160.0 - frontier/cluster_18/score:3.6252414541359164 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.4265223774089097 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:1.7340677308999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:160.0 - frontier/cluster_23/score:2.3969758033181656 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:128.0 - frontier/cluster_26/score:2.6300031472490892 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:1.7222389347592995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.7894524384900565 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:128.0 - frontier/cluster_33/score:2.9565942940529624 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - 
frontier/cluster_34/score:2.30770259 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:3.278350916481462 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:2.18722363763 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:144.0 - frontier/cluster_38/score:2.972420407599527 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:208.0 - frontier/cluster_39/score:2.9997911764577947 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.5732177692715092 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:1.9360145683699994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.123754568369999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:112.0 - frontier/cluster_45/score:2.4243762176299994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:2.9952524384900565 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:3.0902810978656365 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - frontier/cluster_49/score:3.1923326904814617 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:144.0 - frontier/cluster_51/score:1.6390483932076367 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:2.3331340732009993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:80.0 - frontier/cluster_54/score:2.2929394099999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.1483662178589995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:160.0 - frontier/cluster_63/score:1.5942593147873456 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:142.0 - cluster/prob_snapshot/cluster_0:0.045026145074035975 - cluster/prob_snapshot/cluster_1:0.021709859024735784 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.03352929621121977 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.026059656804699724 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - 
cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.025822896155373386 - cluster/prob_snapshot/cluster_12:0.028652358817046034 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.038158060808884564 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.04539818415878539 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.03038686144045548 - cluster/prob_snapshot/cluster_21:0.021715388391962763 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.03001685551704244 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.03293501101306862 - cluster/prob_snapshot/cluster_27:0.021567258709466835 - cluster/prob_snapshot/cluster_28:0.034931763058228774 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.03702484757011117 - cluster/prob_snapshot/cluster_34:0.02889890466330251 - cluster/prob_snapshot/cluster_35:0.04105414233133403 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.027390170490379822 - cluster/prob_snapshot/cluster_38:0.037223034870569494 - cluster/prob_snapshot/cluster_39:0.037565793613929206 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.03222389891762149 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.024244328831857564 - cluster/prob_snapshot/cluster_44:0.02659535984642535 - cluster/prob_snapshot/cluster_45:0.03035998550457375 - cluster/prob_snapshot/cluster_46:0.03750895589298988 - cluster/prob_snapshot/cluster_47:0.038698981063250815 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.03997695304218893 - cluster/prob_snapshot/cluster_50:0.0 - 
cluster/prob_snapshot/cluster_51:0.020525479955303367 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.029217378114628864 - cluster/prob_snapshot/cluster_54:0.028714028270132978 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.026903566681773766 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0199645951546215
[36m(RewardLoopWorker pid=2826760)[0m WARNING:2026-04-12 15:56:13,206:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  18%|█▊        | 143/800 [4:25:29<19:48:30, 108.54s/it]
[36m(TaskRunner pid=2823680)[0m step:143 - global_seqlen/min:376413 - global_seqlen/max:412660 - global_seqlen/minmax_diff:36247 - global_seqlen/balanced_min:393438 - global_seqlen/balanced_max:393516 - global_seqlen/mean:393474.5 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.2285766020531253) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010959211736917496 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0004772925531142391) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00038109460432196455) - actor/ppo_kl:np.float64(3.2224463777719376e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.20329233087026155) - perf/mfu/actor:np.float64(0.21562844905335943) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.24011421203613) - actor/lr:np.float64(1e-06) - training/global_step:143 - training/epoch:0 - critic/score/mean:0.5051020383834839 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4945545494556427 - critic/rewards/max:1.0012001991271973 - critic/rewards/min:-0.07796483486890793 - critic/advantages/mean:-0.1266014575958252 - critic/advantages/max:2.4748194217681885 - critic/advantages/min:-2.474851131439209 - critic/returns/mean:-0.1266014575958252 - critic/returns/max:2.4748194217681885 - critic/returns/min:-2.474851131439209 - response_length/mean:1203.2958984375 - response_length/max:8192.0 - response_length/min:195.0 - response_length/clip_ratio:0.010204081423580647 - response_length_non_aborted/mean:1203.2958984375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:195.0 - response_length_non_aborted/clip_ratio:0.010204081423580647 - response/aborted_ratio:0.0 - prompt_length/mean:244.80612182617188 - prompt_length/max:886.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.768495172262192e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4867480751127005) - timing_s/agent_loop/generate_sequences/max:np.float64(28.393064769916236) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.162724869712292) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.393064769916236) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:234 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.872982954606414 - timing_s/reward:0.00013703294098377228 - timing_s/old_log_prob:10.245845135301352 - timing_s/ref:17.23149184882641 - timing_s/adv:0.07408066466450691 - timing_s/update_actor:21.517822508700192 - timing_s/update_weights:22.701988708227873 - timing_s/step:103.04168694186956 - timing_s/stop_profile:0.00010433141142129898 - timing_per_token_ms/adv:6.525137113366803e-05 - timing_per_token_ms/update_actor:0.0189532238791629 - timing_per_token_ms/gen:0.03272578605807011 - timing_per_token_ms/ref:0.015177758932193449 - perf/total_num_tokens:1573898 - perf/time_per_step:103.04168694186956 - perf/throughput:3818.59528582812 - frontier/active_count:32.0 - frontier/completed_count:32.0 - frontier/blacklisted_count:1632.0 - frontier/mean_score:2.511212088433341 - frontier/mean_frontier_pct:0.6803151848452217 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:12.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.595532523314476 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.1135383308999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:2.7742215743572993 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:1.7566834536589997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.06207 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:224.0 - frontier/cluster_12/score:2.50161038611853 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:128.0 - frontier/cluster_14/score:3.047086274860127 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:176.0 - frontier/cluster_18/score:3.437669017895141 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.4265223774089097 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:1.5138474116299998 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:160.0 - frontier/cluster_23/score:2.3969758033181656 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:128.0 - frontier/cluster_26/score:2.6300031472490892 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:2.1055672543315094 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.7894524384900565 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:128.0 - frontier/cluster_33/score:2.9696160058370733 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.30770259 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:3.278350916481462 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:2.431056546341 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:144.0 - frontier/cluster_38/score:2.972420407599527 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:208.0 - frontier/cluster_39/score:2.9997911764577947 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.5732177692715092 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:1.9360145683699994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:1.7866281978589993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:128.0 - frontier/cluster_45/score:1.9970633523409995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:2.9952524384900565 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.063196768505945 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - frontier/cluster_49/score:3.1923326904814617 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:144.0 - 
frontier/cluster_51/score:2.0473338752453456 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:2.3331340732009993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:1.9050575869999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.4038563525012995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:160.0 - frontier/cluster_63/score:2.0159815203511418 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:143.0 - cluster/prob_snapshot/cluster_0:0.04474348935763334 - cluster/prob_snapshot/cluster_1:0.02630127226005435 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.03452293997706553 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.021860502416222314 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - 
cluster/prob_snapshot/cluster_11:0.025660790578704844 - cluster/prob_snapshot/cluster_12:0.031130514593442786 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.03791852011543332 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.04277900592468208 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.030196105157065976 - cluster/prob_snapshot/cluster_21:0.018838604605057936 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.029828421979452815 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0327282584892334 - cluster/prob_snapshot/cluster_27:0.026202078669870288 - cluster/prob_snapshot/cluster_28:0.034712475742021805 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.03695446537942702 - cluster/prob_snapshot/cluster_34:0.02871748916376494 - cluster/prob_snapshot/cluster_35:0.040796421223011775 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.030252529216101234 - cluster/prob_snapshot/cluster_38:0.03698936388739469 - cluster/prob_snapshot/cluster_39:0.03732997093160276 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.03202161046457116 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.02409213285497787 - cluster/prob_snapshot/cluster_44:0.02223314049827049 - cluster/prob_snapshot/cluster_45:0.024851835513260284 - cluster/prob_snapshot/cluster_46:0.037273490014620436 - cluster/prob_snapshot/cluster_47:0.03811900215705407 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.03972599408749373 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.02547741144449948 - 
cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.029033963369066745 - cluster/prob_snapshot/cluster_54:0.02370689830140576 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.029914044839809095 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.025087256787727698
[36m(TaskRunner pid=2823680)[0m Training Progress:  18%|█▊        | 144/800 [4:27:17<19:44:11, 108.31s/it]
[36m(TaskRunner pid=2823680)[0m step:144 - global_seqlen/min:317623 - global_seqlen/max:463630 - global_seqlen/minmax_diff:146007 - global_seqlen/balanced_min:385102 - global_seqlen/balanced_max:385177 - global_seqlen/mean:385155.25 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.256318740926835) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01376675721257925 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.04862381710609043) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000414246183750873) - actor/ppo_kl:np.float64(3.812976181934016e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.23590468557981345) - perf/mfu/actor:np.float64(0.21315979788991662) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(107.2767219543457) - actor/lr:np.float64(1e-06) - training/global_step:144 - training/epoch:0 - critic/score/mean:0.5966494679450989 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5859355926513672 - critic/rewards/max:1.0067329406738281 - critic/rewards/min:-0.10491904616355896 - critic/advantages/mean:-0.10705755650997162 - critic/advantages/max:2.474860906600952 - critic/advantages/min:-2.474848985671997 - critic/returns/mean:-0.10705755650997162 - critic/returns/max:2.474860906600952 - critic/returns/min:-2.474848985671997 - response_length/mean:1085.2061767578125 - response_length/max:8192.0 - response_length/min:155.0 - response_length/clip_ratio:0.005154639016836882 - response_length_non_aborted/mean:1085.2061767578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:155.0 - response_length_non_aborted/clip_ratio:0.005154639016836882 - response/aborted_ratio:0.0 - prompt_length/mean:226.83505249023438 - prompt_length/max:370.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010521057993173599 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2763127889484167) - timing_s/agent_loop/generate_sequences/max:np.float64(28.356725755147636) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.787255671179992) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.356725755147636) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:200 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.476669271476567 - timing_s/reward:0.00019993353635072708 - timing_s/old_log_prob:9.66363909933716 - timing_s/ref:20.28796022478491 - timing_s/adv:0.08705294225364923 - timing_s/update_actor:21.224144659936428 - timing_s/update_weights:25.407401631586254 - timing_s/step:107.53606270626187 - timing_s/stop_profile:5.0129368901252747e-05 - timing_per_token_ms/adv:8.550160120145012e-05 - timing_per_token_ms/update_actor:0.020845916353616413 - timing_per_token_ms/gen:0.03619041142767844 - timing_per_token_ms/ref:0.019926415344769414 - perf/total_num_tokens:1540621 - perf/time_per_step:107.53606270626187 - perf/throughput:3581.6380133989437 - frontier/active_count:30.0 - frontier/completed_count:34.0 - frontier/blacklisted_count:1663.0 - frontier/mean_score:2.4508644200533753 - frontier/mean_frontier_pct:0.6939606850434579 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:12.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.1135383308999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:2.8419551020501093 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:1.7566834536589997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.06207 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:240.0 - frontier/cluster_12/score:2.051127270282971 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:3.0329603924020887 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - 
frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:176.0 - frontier/cluster_18/score:3.3063683125265984 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:128.0 - frontier/cluster_20/score:2.5985656641862365 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:144.0 - frontier/cluster_21/score:1.359693188141 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:176.0 - frontier/cluster_23/score:1.977883062322716 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:128.0 - frontier/cluster_26/score:2.6300031472490892 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:2.1055672543315094 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:2.978731204085951 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - 
frontier/cluster_34/score:2.515391813 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:3.278350916481462 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:128.0 - frontier/cluster_37/score:2.0017395824386996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:144.0 - frontier/cluster_38/score:2.972420407599527 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:208.0 - frontier/cluster_39/score:2.9997911764577947 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:128.0 - frontier/cluster_41/score:2.701252438490056 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:1.9360145683699994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:1.7866281978589993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:128.0 - frontier/cluster_45/score:1.9970633523409995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:2.9952524384900565 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.063196768505945 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - frontier/cluster_49/score:3.134632883337023 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:144.0 - frontier/cluster_51/score:2.0473338752453456 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:2.3331340732009993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:2.2335403108999996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.4038563525012995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:176.0 - frontier/cluster_63/score:2.3111870642457992 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:144.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.028745481439210886 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.03865241828960109 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.02389202545960964 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - 
cluster/prob_snapshot/cluster_11:0.028045479017223537 - cluster/prob_snapshot/cluster_12:0.027896650851026418 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.04125021315725598 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.044968736818913765 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.03534216530467571 - cluster/prob_snapshot/cluster_21:0.018492702370855593 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.026900482487448807 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.035769735305497 - cluster/prob_snapshot/cluster_27:0.02863706966820104 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.040512661297696755 - cluster/prob_snapshot/cluster_34:0.03421094740313732 - cluster/prob_snapshot/cluster_35:0.0445876822024841 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.027224946513566637 - cluster/prob_snapshot/cluster_38:0.040426830404253766 - cluster/prob_snapshot/cluster_39:0.040799090474813844 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.03673877151789065 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.02633104402578957 - cluster/prob_snapshot/cluster_44:0.02429929325126425 - cluster/prob_snapshot/cluster_45:0.027161346774913362 - cluster/prob_snapshot/cluster_46:0.04073736071760925 - cluster/prob_snapshot/cluster_47:0.04166144733047881 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.04263302446392576 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.027845058237326713 - 
cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.03173212485237063 - cluster/prob_snapshot/cluster_54:0.030377585592860027 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.03269399336321983 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.031433631406879506
(RewardLoopWorker pid=2826755) WARNING:2026-04-12 15:59:59,380:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  18%|█▊        | 145/800 [4:29:11<20:01:35, 110.07s/it]
[36m(TaskRunner pid=2823680)[0m step:145 - global_seqlen/min:351534 - global_seqlen/max:455901 - global_seqlen/minmax_diff:104367 - global_seqlen/balanced_min:401028 - global_seqlen/balanced_max:401091 - global_seqlen/mean:401065.25 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.23173982786635558) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010297277010977268 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04096457897685468) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00038757218135287985) - actor/ppo_kl:np.float64(5.6339147264831505e-05) - actor/pg_clipfrac_lower:np.float64(1.7470300210536353e-06) - actor/grad_norm:np.float64(0.2626676733295123) - perf/mfu/actor:np.float64(0.22807336987446922) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.69175720214844) - actor/lr:np.float64(1e-06) - training/global_step:145 - training/epoch:0 - critic/score/mean:0.5569444298744202 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5471972823143005 - critic/rewards/max:1.0097600221633911 - critic/rewards/min:-0.05431322008371353 - critic/advantages/mean:-0.13336893916130066 - critic/advantages/max:2.4747743606567383 - critic/advantages/min:-2.474850654602051 - critic/returns/mean:-0.13336893916130066 - critic/returns/max:2.4747743606567383 - critic/returns/min:-2.474850654602051 - response_length/mean:1245.9000244140625 - response_length/max:8192.0 - response_length/min:183.0 - response_length/clip_ratio:0.012500000186264515 - response_length_non_aborted/mean:1245.9000244140625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:183.0 - response_length_non_aborted/clip_ratio:0.012500000186264515 - response/aborted_ratio:0.0 - prompt_length/mean:239.7888946533203 - prompt_length/max:411.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.553639054298401e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6048729894682765) - timing_s/agent_loop/generate_sequences/max:np.float64(29.886783197522163) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.461187475879342) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.886783197522163) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:187 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.789306661114097 - timing_s/reward:0.0002628955990076065 - timing_s/old_log_prob:9.504762531258166 - timing_s/ref:21.661593086086214 - timing_s/adv:0.08846413344144821 - timing_s/update_actor:20.658438520506024 - timing_s/update_weights:29.795610619708896 - timing_s/step:113.9743579076603 - timing_s/stop_profile:6.164982914924622e-05 - timing_per_token_ms/adv:8.270025637325764e-05 - timing_per_token_ms/update_actor:0.01931243878681983 - timing_per_token_ms/gen:0.035437687460552945 - timing_per_token_ms/ref:0.02025023285689225 - perf/total_num_tokens:1604261 - perf/time_per_step:113.9743579076603 - perf/throughput:3518.907738220687 - frontier/active_count:30.0 - frontier/completed_count:34.0 - frontier/blacklisted_count:1701.0 - frontier/mean_score:2.4581826695325097 - frontier/mean_frontier_pct:0.7152836926019742 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:12.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:112.0 - frontier/cluster_1/score:1.7794768316299996 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:2.8419551020501093 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:1.7566834536589997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.06207 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:240.0 - frontier/cluster_12/score:2.3357890891980793 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:3.0329603924020887 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - 
frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:192.0 - frontier/cluster_18/score:3.2144578187686186 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:128.0 - frontier/cluster_20/score:2.7189959649303654 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:144.0 - frontier/cluster_21/score:1.8517852316986998 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:176.0 - frontier/cluster_23/score:2.2845181436259008 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:144.0 - frontier/cluster_26/score:2.141002203074362 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:144.0 - frontier/cluster_27/score:2.3738970780320563 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:2.9851118428601655 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - 
frontier/cluster_34/score:2.515391813 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:3.278350916481462 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:128.0 - frontier/cluster_37/score:2.0017395824386996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:144.0 - frontier/cluster_38/score:2.972420407599527 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:224.0 - frontier/cluster_39/score:2.399853823520456 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:128.0 - frontier/cluster_41/score:2.790876706943039 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.2552101978589993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:1.7866281978589993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:128.0 - frontier/cluster_45/score:1.9970633523409995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:128.0 - frontier/cluster_46/score:2.9966767069430396 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.044237737954161 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - frontier/cluster_49/score:3.134632883337023 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:160.0 - frontier/cluster_51/score:1.7331337126717419 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:2.3331340732009993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:2.2335403108999996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:144.0 - frontier/cluster_57/score:2.5826994467509095 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:176.0 - frontier/cluster_63/score:2.3111870642457992 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:145.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.024129978265182053 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.03853734627175591 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.023820896570352395 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - 
cluster/prob_snapshot/cluster_11:0.02796198489176503 - cluster/prob_snapshot/cluster_12:0.03167365764620323 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.041127407250805714 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.04358854013861028 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.03687000154803327 - cluster/prob_snapshot/cluster_21:0.025110491240138623 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.030978415774940003 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.029032317649546797 - cluster/prob_snapshot/cluster_27:0.03219040780891896 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.040478573593663074 - cluster/prob_snapshot/cluster_34:0.034109098077162975 - cluster/prob_snapshot/cluster_35:0.044454940325284145 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.027143895193373145 - cluster/prob_snapshot/cluster_38:0.040306475788538715 - cluster/prob_snapshot/cluster_39:0.032542385251579015 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.03784471541427799 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.030580995543452824 - cluster/prob_snapshot/cluster_44:0.024226951886082784 - cluster/prob_snapshot/cluster_45:0.027080484797342794 - cluster/prob_snapshot/cluster_46:0.04063539492114499 - cluster/prob_snapshot/cluster_47:0.04128032978299243 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.042506101793391926 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.023501558477227194 - 
cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.03163765522281423 - cluster/prob_snapshot/cluster_54:0.03028714855874629 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.035021799895260025 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.03134005042141334
(TaskRunner pid=2823680) Training Progress:  18%|█▊        | 146/800 [4:31:04<20:09:10, 110.93s/it]
[36m(TaskRunner pid=2823680)[0m step:146 - global_seqlen/min:369570 - global_seqlen/max:418450 - global_seqlen/minmax_diff:48880 - global_seqlen/balanced_min:388188 - global_seqlen/balanced_max:388240 - global_seqlen/mean:388203.0 - frontier/skipped_zero_acc_count:27.0 - actor/entropy:np.float64(0.2163318942355759) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0117671312764287 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.040987463115016) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00033919821856340834) - actor/ppo_kl:np.float64(1.5196934608705787e-05) - actor/pg_clipfrac_lower:np.float64(1.723857327734175e-07) - actor/grad_norm:np.float64(0.22953197016165808) - perf/mfu/actor:np.float64(0.20013122478671488) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(106.07870483398438) - actor/lr:np.float64(1e-06) - training/global_step:146 - training/epoch:0 - critic/score/mean:0.5853960514068604 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5752756595611572 - critic/rewards/max:1.0097037553787231 - critic/rewards/min:-0.06581626832485199 - critic/advantages/mean:-0.1268831044435501 - critic/advantages/max:2.4748220443725586 - critic/advantages/min:-2.4748475551605225 - critic/returns/mean:-0.1268831044435501 - critic/returns/max:2.4748220443725586 - critic/returns/min:-2.4748475551605225 - response_length/mean:1195.1671142578125 - response_length/max:8192.0 - response_length/min:120.0 - response_length/clip_ratio:0.011138614267110825 - response_length_non_aborted/mean:1195.1671142578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:120.0 - response_length_non_aborted/clip_ratio:0.011138614267110825 - response/aborted_ratio:0.0 - prompt_length/mean:247.93069458007812 - prompt_length/max:657.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.529331535100937e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9859206676483154) - timing_s/agent_loop/generate_sequences/max:np.float64(29.299635108560324) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.901533304657278) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.299635108560324) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:213 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.02903289720416 - timing_s/reward:0.0001697847619652748 - timing_s/old_log_prob:10.918873446062207 - timing_s/ref:20.397065851837397 - timing_s/adv:0.11149836611002684 - timing_s/update_actor:22.840665061958134 - timing_s/update_weights:26.981671035289764 - timing_s/step:112.67162483464926 - timing_s/stop_profile:5.6084245443344116e-05 - timing_per_token_ms/adv:9.562278455058505e-05 - timing_per_token_ms/update_actor:0.0195885201766673 - timing_per_token_ms/gen:0.03213129704223814 - timing_per_token_ms/ref:0.017492850357014738 - perf/total_num_tokens:1552812 - perf/time_per_step:112.67162483464926 - perf/throughput:3445.437132638369 - frontier/active_count:28.0 - frontier/completed_count:36.0 - frontier/blacklisted_count:1727.0 - frontier/mean_score:2.470377413341486 - frontier/mean_frontier_pct:0.7306977278208375 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:12.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:112.0 - frontier/cluster_1/score:1.7794768316299996 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:2.8419551020501093 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:1.7566834536589997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.3434489999999997 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:3.0329603924020887 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - 
frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:208.0 - frontier/cluster_18/score:3.7501204731380327 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:160.0 - frontier/cluster_21/score:1.5962496621890898 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:176.0 - frontier/cluster_23/score:2.2845181436259008 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:144.0 - frontier/cluster_26/score:2.141002203074362 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:144.0 - frontier/cluster_27/score:2.3738970780320563 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:2.9895782900021155 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:2.515391813 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:3.194845641537023 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:128.0 - frontier/cluster_37/score:2.0017395824386996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:160.0 - frontier/cluster_38/score:2.380694285319669 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:224.0 - frontier/cluster_39/score:2.579897676464319 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:144.0 - frontier/cluster_41/score:2.853613694860127 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:2.4786471385012994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:1.7866281978589993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:128.0 - frontier/cluster_45/score:1.9970633523409995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:128.0 - frontier/cluster_46/score:2.9976736948601275 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:144.0 - frontier/cluster_47/score:3.0309664165679124 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - frontier/cluster_49/score:3.134632883337023 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:176.0 - 
frontier/cluster_51/score:1.5131935988702192 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:2.3331340732009993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.4634782176299996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:144.0 - frontier/cluster_57/score:2.707889612725636 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:176.0 - frontier/cluster_63/score:2.3111870642457992 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:146.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.02572592497144915 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.04108619029369309 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.02539640074213356 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - 
cluster/prob_snapshot/cluster_11:0.033879279615680255 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.04384755682648681 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0542155515660595 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.023077006856876053 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.03302731528400946 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.030952503039641043 - cluster/prob_snapshot/cluster_27:0.03431946796601692 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.043220381079318956 - cluster/prob_snapshot/cluster_34:0.03636505960898655 - cluster/prob_snapshot/cluster_35:0.04618793445838599 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.028939181109217976 - cluster/prob_snapshot/cluster_38:0.03441773530031895 - cluster/prob_snapshot/cluster_39:0.037297621907188286 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0412547387561243 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.03583384126298088 - cluster/prob_snapshot/cluster_44:0.025829312387222404 - cluster/prob_snapshot/cluster_45:0.02887157677602052 - cluster/prob_snapshot/cluster_46:0.04333741647662725 - cluster/prob_snapshot/cluster_47:0.0438187298860102 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.045317437651547374 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.021876264023147792 - cluster/prob_snapshot/cluster_52:0.0 - 
cluster/prob_snapshot/cluster_53:0.03373015655422741 - cluster/prob_snapshot/cluster_54:0.035614543931711075 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0391480033736292 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.03341286829528896
(RewardLoopWorker pid=2826762) WARNING:2026-04-12 16:03:26,535:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  18%|█▊        | 147/800 [4:32:54<20:04:50, 110.71s/it]
[36m(TaskRunner pid=2823680)[0m step:147 - global_seqlen/min:335644 - global_seqlen/max:434358 - global_seqlen/minmax_diff:98714 - global_seqlen/balanced_min:386739 - global_seqlen/balanced_max:386796 - global_seqlen/mean:386767.0 - frontier/skipped_zero_acc_count:39.0 - actor/entropy:np.float64(0.2364285108115938) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01090242713689804 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.06668801957857795) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006104016507581238) - actor/ppo_kl:np.float64(4.83485733310671e-05) - actor/pg_clipfrac_lower:np.float64(6.110806942969146e-06) - actor/grad_norm:np.float64(0.24072483430306116) - perf/mfu/actor:np.float64(0.22121039882966276) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.18608093261719) - actor/lr:np.float64(1e-06) - training/global_step:147 - training/epoch:0 - critic/score/mean:0.5463483333587646 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5368024706840515 - critic/rewards/max:1.0313570499420166 - critic/rewards/min:-0.06286446005105972 - critic/advantages/mean:-0.12767496705055237 - critic/advantages/max:2.4748003482818604 - critic/advantages/min:-2.4748520851135254 - critic/returns/mean:-0.12767496705055237 - critic/returns/max:2.4748003482818604 - critic/returns/min:-2.4748520851135254 - response_length/mean:1186.8482666015625 - response_length/max:8192.0 - response_length/min:210.0 - response_length/clip_ratio:0.012640449218451977 - response_length_non_aborted/mean:1186.8482666015625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:210.0 - response_length_non_aborted/clip_ratio:0.012640449218451977 - response/aborted_ratio:0.0 - prompt_length/mean:245.20223999023438 - prompt_length/max:657.0 - prompt_length/min:188.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.858367800712585e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.458818256855011) - timing_s/agent_loop/generate_sequences/max:np.float64(28.971102046780288) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.074784413904126) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.971102046780288) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:209 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.720468780957162 - timing_s/reward:0.0002170642837882042 - timing_s/old_log_prob:10.022647705860436 - timing_s/ref:20.47881031781435 - timing_s/adv:0.06864785868674517 - timing_s/update_actor:20.530270118266344 - timing_s/update_weights:27.71049427986145 - timing_s/step:109.94366153981537 - timing_s/stop_profile:6.263703107833862e-05 - timing_per_token_ms/adv:6.732690481428881e-05 - timing_per_token_ms/update_actor:0.020135217157633573 - timing_per_token_ms/gen:0.03635403554518052 - timing_per_token_ms/ref:0.020084747570481502 - perf/total_num_tokens:1547068 - perf/time_per_step:109.94366153981537 - perf/throughput:3517.8653738026987 - frontier/active_count:26.0 - frontier/completed_count:38.0 - frontier/blacklisted_count:1764.0 - frontier/mean_score:2.4192711249113605 - frontier/mean_frontier_pct:0.7333093309721641 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:12.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:112.0 - frontier/cluster_1/score:1.7794768316299996 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:144.0 - frontier/cluster_3/score:2.8893685714350763 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.5296784175612999 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.3434489999999997 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:3.023072274681462 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - 
frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:208.0 - frontier/cluster_18/score:3.7501204731380327 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:176.0 - frontier/cluster_21/score:1.4173747635323628 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:176.0 - frontier/cluster_23/score:2.2845181436259008 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:144.0 - frontier/cluster_26/score:2.141002203074362 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:144.0 - frontier/cluster_27/score:2.3738970780320563 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:2.9895782900021155 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:2.6607742691 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:128.0 - frontier/cluster_37/score:2.0017395824386996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:224.0 - frontier/cluster_39/score:2.579897676464319 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:144.0 - frontier/cluster_41/score:2.8975295864020887 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:2.6350529969509093 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:2.150639738501299 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:128.0 - frontier/cluster_45/score:1.9970633523409995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:2.398371586402089 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:144.0 - frontier/cluster_47/score:3.0216764915975385 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:176.0 - frontier/cluster_49/score:3.094243018335916 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:192.0 - 
frontier/cluster_51/score:1.3592355192091534 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:2.3331340732009993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.6244347523409997 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:144.0 - frontier/cluster_57/score:2.707889612725636 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:192.0 - frontier/cluster_63/score:1.9178309449720594 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:147.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.028290097747378954 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.04593514108257804 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.024318806058983913 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - 
cluster/prob_snapshot/cluster_11:0.03725611938159936 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.04806076068424604 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.059619362760875295 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.022533404140063593 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.036319237452300576 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.034037623039377296 - cluster/prob_snapshot/cluster_27:0.037740182499722504 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.04752827378490273 - cluster/prob_snapshot/cluster_34:0.042300952065556965 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.031823627846908155 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0410151771285253 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.046064884784227204 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0418920356411615 - cluster/prob_snapshot/cluster_44:0.03419084044261943 - cluster/prob_snapshot/cluster_45:0.031749285206306314 - cluster/prob_snapshot/cluster_46:0.03812927789101964 - cluster/prob_snapshot/cluster_47:0.048038570544326004 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.049192232170106216 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.021609107248063184 - cluster/prob_snapshot/cluster_52:0.0 - 
cluster/prob_snapshot/cluster_53:0.037092132819811156 - cluster/prob_snapshot/cluster_54:0.04172322693706347 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.04304999113865895 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.030489649503618203
[36m(TaskRunner pid=2823680)[0m Training Progress:  18%|█▊        | 148/800 [4:34:43<19:58:12, 110.27s/it]
[36m(TaskRunner pid=2823680)[0m step:148 - global_seqlen/min:362417 - global_seqlen/max:449416 - global_seqlen/minmax_diff:86999 - global_seqlen/balanced_min:397938 - global_seqlen/balanced_max:398165 - global_seqlen/mean:398049.5 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.26849094753464064) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01021355576813221 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.02576282503287075) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00031502587783810063) - actor/ppo_kl:np.float64(1.6809102836153517e-05) - actor/pg_clipfrac_lower:np.float64(4.4193426826192684e-07) - actor/grad_norm:np.float64(0.22755657508969307) - perf/mfu/actor:np.float64(0.22629848999299487) - perf/max_memory_allocated_gb:np.float64(78.10701704025269) - perf/max_memory_reserved_gb:np.float64(84.439453125) - perf/cpu_memory_used_gb:np.float64(105.6755599975586) - actor/lr:np.float64(1e-06) - training/global_step:148 - training/epoch:0 - critic/score/mean:0.5666666626930237 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5567597150802612 - critic/rewards/max:1.0066193342208862 - critic/rewards/min:-0.05863199383020401 - critic/advantages/mean:-0.12352364510297775 - critic/advantages/max:2.474807024002075 - critic/advantages/min:-2.4748408794403076 - critic/returns/mean:-0.12352364510297775 - critic/returns/max:2.474807024002075 - critic/returns/min:-2.4748408794403076 - response_length/mean:1183.38330078125 - response_length/max:8192.0 - response_length/min:153.0 - response_length/clip_ratio:0.0027777778450399637 - response_length_non_aborted/mean:1183.38330078125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:153.0 - response_length_non_aborted/clip_ratio:0.0027777778450399637 - response/aborted_ratio:0.0 - prompt_length/mean:242.96665954589844 - prompt_length/max:572.0 - prompt_length/min:181.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.580926805734634e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7225385820493102) - timing_s/agent_loop/generate_sequences/max:np.float64(28.68285798560828) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.759258512647648) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.68285798560828) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:205 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.111347924917936 - timing_s/reward:0.00011616107076406479 - timing_s/old_log_prob:9.533268590457737 - timing_s/ref:21.016380790621042 - timing_s/adv:0.07303070649504662 - timing_s/update_actor:20.513752567581832 - timing_s/update_weights:27.408725204877555 - timing_s/step:109.03326671663672 - timing_s/stop_profile:5.1555223762989044e-05 - timing_per_token_ms/adv:7.111265593905834e-05 - timing_per_token_ms/update_actor:0.019974987212486642 - timing_per_token_ms/gen:0.03534046439929526 - timing_per_token_ms/ref:0.020464414600029058 - perf/total_num_tokens:1592198 - perf/time_per_step:109.03326671663672 - perf/throughput:3650.716079473973 - frontier/active_count:24.0 - frontier/completed_count:40.0 - frontier/blacklisted_count:1801.0 - frontier/mean_score:2.3269054540181866 - frontier/mean_frontier_pct:0.7445553168382034 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:12.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:1.5456337821409998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:144.0 - frontier/cluster_3/score:2.9225580000045532 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:144.0 - frontier/cluster_6/score:1.37077489229291 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.5404142999999997 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:3.023072274681462 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - 
frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:176.0 - frontier/cluster_21/score:1.892162334472654 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:1.8991627005381304 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:144.0 - frontier/cluster_26/score:2.141002203074362 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:160.0 - frontier/cluster_27/score:1.9617279546224393 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:2.9895782900021155 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:2.6607742691 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:128.0 - frontier/cluster_37/score:2.0017395824386996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:240.0 - frontier/cluster_39/score:2.1059283735250234 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:160.0 - frontier/cluster_41/score:2.928270710481462 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:128.0 - frontier/cluster_43/score:2.144537097865636 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:144.0 - frontier/cluster_44/score:2.4054478169509093 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:128.0 - frontier/cluster_45/score:2.2979443466386993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:2.398371586402089 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:176.0 - frontier/cluster_49/score:3.094243018335916 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:192.0 - frontier/cluster_51/score:1.8514648634464073 
- frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:2.3331340732009993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.6244347523409997 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:160.0 - frontier/cluster_57/score:2.795522728907945 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:192.0 - frontier/cluster_63/score:1.9178309449720594 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:148.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.027676847582267506 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.05233270212586728 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.024545741819279855 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.045489856775463994 - 
cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.054132558141062215 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.033881951298687235 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.034007303155545524 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.03833779536424654 - cluster/prob_snapshot/cluster_27:0.035127626107363155 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.05353279905219901 - cluster/prob_snapshot/cluster_34:0.0476450791562616 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.035844093188624214 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.03770974682792455 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.05243499661436637 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.03840109285779764 - cluster/prob_snapshot/cluster_44:0.04307308326596548 - cluster/prob_snapshot/cluster_45:0.041148075416904094 - cluster/prob_snapshot/cluster_46:0.042946372943882975 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.05540697504835342 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.03315320318539424 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.04177819925980907 - cluster/prob_snapshot/cluster_54:0.046994366627735644 - 
cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.05005794863876207 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0343415855462362
[36m(TaskRunner pid=2823680)[0m Training Progress:  19%|█▊        | 149/800 [4:36:43<20:26:27, 113.04s/it]
[36m(TaskRunner pid=2823680)[0m step:149 - global_seqlen/min:411928 - global_seqlen/max:475526 - global_seqlen/minmax_diff:63598 - global_seqlen/balanced_min:444891 - global_seqlen/balanced_max:444991 - global_seqlen/mean:444950.75 - frontier/skipped_zero_acc_count:28.0 - actor/entropy:np.float64(0.20705675773322582) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0100906603038311 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04216858104337007) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004480666770541575) - actor/ppo_kl:np.float64(3.9898173863832656e-05) - actor/pg_clipfrac_lower:np.float64(6.874867176520638e-07) - actor/grad_norm:np.float64(0.21130168323333448) - perf/mfu/actor:np.float64(0.24268340888314574) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(105.98424530029297) - actor/lr:np.float64(1e-06) - training/global_step:149 - training/epoch:0 - critic/score/mean:0.5237500071525574 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.513343334197998 - critic/rewards/max:1.012136697769165 - critic/rewards/min:-0.049687668681144714 - critic/advantages/mean:-0.13530884683132172 - critic/advantages/max:2.474832057952881 - critic/advantages/min:-2.4748294353485107 - critic/returns/mean:-0.13530884683132172 - critic/returns/max:2.474832057952881 - critic/returns/min:-2.4748294353485107 - response_length/mean:1370.061279296875 - response_length/max:8192.0 - response_length/min:234.0 - response_length/clip_ratio:0.01875000074505806 - response_length_non_aborted/mean:1370.061279296875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:234.0 - response_length_non_aborted/clip_ratio:0.01875000074505806 - response/aborted_ratio:0.0 - prompt_length/mean:233.5399932861328 - prompt_length/max:356.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.0001265043392777443 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7302277758717537) - timing_s/agent_loop/generate_sequences/max:np.float64(30.552216861397028) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.9634492260702245) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.552216861397028) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:211 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.44552735053003 - timing_s/reward:0.000128074549138546 - timing_s/old_log_prob:10.714552893303335 - timing_s/ref:23.502586307004094 - timing_s/adv:0.09128942340612411 - timing_s/update_actor:21.854842090047896 - timing_s/update_weights:30.26214237138629 - timing_s/step:119.25920300744474 - timing_s/stop_profile:9.516719728708267e-05 - timing_per_token_ms/adv:7.11596971239921e-05 - timing_per_token_ms/update_actor:0.01703575163249584 - timing_per_token_ms/gen:0.02960225989032427 - timing_per_token_ms/ref:0.018320160877746332 - perf/total_num_tokens:1779803 - perf/time_per_step:119.25920300744474 - perf/throughput:3730.9552535935027 - frontier/active_count:21.0 - frontier/completed_count:43.0 - frontier/blacklisted_count:1829.0 - frontier/mean_score:2.2914717966733664 - frontier/mean_frontier_pct:0.740305401312426 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:12.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:1.9819436474986998 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:144.0 - frontier/cluster_6/score:1.37077489229291 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.6782900099999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:2.416150592277023 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:2.229413890376691 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:160.0 - frontier/cluster_26/score:1.7987015421520534 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:176.0 - frontier/cluster_27/score:1.6732095682357075 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:2.9927048030014807 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:2.7625419883699998 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:128.0 - frontier/cluster_37/score:2.0017395824386996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:160.0 - frontier/cluster_41/score:2.928270710481462 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:128.0 - frontier/cluster_43/score:2.144537097865636 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:144.0 - frontier/cluster_44/score:2.4054478169509093 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:144.0 - frontier/cluster_45/score:1.9085610426470894 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:160.0 - frontier/cluster_46/score:1.9788601104814623 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:176.0 - frontier/cluster_49/score:3.065970112835141 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:192.0 - frontier/cluster_51/score:1.8514648634464073 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:2.533193851240699 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.6244347523409997 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:160.0 - frontier/cluster_57/score:2.8568659102355616 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:192.0 - frontier/cluster_63/score:1.9178309449720594 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:149.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.04118674690455398 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.028486056413984072 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.05565751221942232 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - 
cluster/prob_snapshot/cluster_14:0.05020999615856496 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.046329423020844016 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.03737879493543782 - cluster/prob_snapshot/cluster_27:0.034770947747265524 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.06219136222002292 - cluster/prob_snapshot/cluster_34:0.05740835156024441 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.04159812598848594 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.06085235812472693 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0445655994249335 - cluster/prob_snapshot/cluster_44:0.04998758191430061 - cluster/prob_snapshot/cluster_45:0.039661783883009664 - cluster/prob_snapshot/cluster_46:0.041122667959191395 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.06371388773522159 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.03847526887541932 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.05264227070375949 - cluster/prob_snapshot/cluster_54:0.05453834676308851 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.05936849583670995 - 
cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.03985442161081304
[36m(RewardLoopWorker pid=2826755)[0m WARNING:2026-04-12 16:09:08,515:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m local_global_step_folder: /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/checkpoints/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633/global_step_150
[36m(WorkerDict pid=2825160)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.
[36m(WorkerDict pid=2825160)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")
[36m(TaskRunner pid=2823680)[0m test_gen_batch meta info: {'eos_token_id': 151643, 'pad_token_id': 151643, 'recompute_log_prob': False, 'do_sample': True, 'validate': True, 'global_steps': 150}
[36m(TaskRunner pid=2823680)[0m validation generation end
[36m(TaskRunner pid=2823680)[0m Training Progress:  19%|█▉        | 150/800 [4:41:39<30:19:37, 167.96s/it]
[36m(WorkerDict pid=2825157)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.[32m [repeated 3x across cluster][0m
[36m(WorkerDict pid=2825157)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")[32m [repeated 3x across cluster][0m
[36m(TaskRunner pid=2823680)[0m step:150 - global_seqlen/min:337144 - global_seqlen/max:485456 - global_seqlen/minmax_diff:148312 - global_seqlen/balanced_min:418420 - global_seqlen/balanced_max:418597 - global_seqlen/mean:418509.75 - frontier/skipped_zero_acc_count:23.0 - actor/entropy:np.float64(0.21234414745825078) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01082642562687397 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.08781316800741479) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005178801086415506) - actor/ppo_kl:np.float64(0.00011506392576058784) - actor/pg_clipfrac_lower:np.float64(1.5490623586371822e-06) - actor/grad_norm:np.float64(0.21088762794222152) - perf/mfu/actor:np.float64(0.15836184354363642) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(105.95995712280273) - actor/lr:np.float64(1e-06) - val-aux/aime2024/reward/mean@16:np.float64(0.06666666666666667) - val-aux/aime2024/reward/std@16:np.float64(0.12426769936730599) - val-aux/aime2024/reward/best@2/mean:np.float64(0.11526666666666667) - val-aux/aime2024/reward/best@2/std:np.float64(0.14536984516192072) - val-aux/aime2024/reward/worst@2/mean:np.float64(0.019000000000000003) - val-aux/aime2024/reward/worst@2/std:np.float64(0.06444035959593135) - val-aux/aime2024/reward/maj@2/mean:np.float64(0.06736666666666666) - val-aux/aime2024/reward/maj@2/std:np.float64(0.12449151891021958) - val-aux/aime2024/reward/best@4/mean:np.float64(0.18083333333333335) - val-aux/aime2024/reward/best@4/std:np.float64(0.14811118553475167) - val-aux/aime2024/reward/worst@4/mean:np.float64(0.002433333333333334) - val-aux/aime2024/reward/worst@4/std:np.float64(0.016555850575860113) - val-aux/aime2024/reward/maj@4/mean:np.float64(0.07863333333333335) - val-aux/aime2024/reward/maj@4/std:np.float64(0.12590367866461) - 
val-aux/aime2024/reward/best@8/mean:np.float64(0.24833333333333335) - val-aux/aime2024/reward/best@8/std:np.float64(0.12403624386056782) - val-aux/aime2024/reward/worst@8/mean:np.float64(0.00013333333333333334) - val-aux/aime2024/reward/worst@8/std:np.float64(0.002978441053825157) - val-aux/aime2024/reward/maj@8/mean:np.float64(0.08906666666666666) - val-aux/aime2024/reward/maj@8/std:np.float64(0.12125764081984834) - val-aux/aime2024/reward/best@16/mean:np.float64(0.3035) - val-aux/aime2024/reward/best@16/std:np.float64(0.07570482728901595) - val-aux/aime2024/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2024/reward/worst@16/std:np.float64(0.0) - val-aux/aime2024/reward/maj@16/mean:np.float64(0.09703333333333333) - val-aux/aime2024/reward/maj@16/std:np.float64(0.10987353283031487) - val-aux/aime2024/score/mean@16:np.float64(0.06666666666666667) - val-aux/aime2024/score/std@16:np.float64(0.12426769936730599) - val-aux/aime2024/score/best@2/mean:np.float64(0.11526666666666667) - val-aux/aime2024/score/best@2/std:np.float64(0.14536984516192072) - val-aux/aime2024/score/worst@2/mean:np.float64(0.019000000000000003) - val-aux/aime2024/score/worst@2/std:np.float64(0.06444035959593135) - val-aux/aime2024/score/maj@2/mean:np.float64(0.06736666666666666) - val-aux/aime2024/score/maj@2/std:np.float64(0.12449151891021958) - val-aux/aime2024/score/best@4/mean:np.float64(0.18083333333333335) - val-aux/aime2024/score/best@4/std:np.float64(0.14811118553475167) - val-aux/aime2024/score/worst@4/mean:np.float64(0.002433333333333334) - val-aux/aime2024/score/worst@4/std:np.float64(0.016555850575860113) - val-aux/aime2024/score/maj@4/mean:np.float64(0.07863333333333335) - val-aux/aime2024/score/maj@4/std:np.float64(0.12590367866461) - val-aux/aime2024/score/best@8/mean:np.float64(0.24833333333333335) - val-aux/aime2024/score/best@8/std:np.float64(0.12403624386056782) - val-aux/aime2024/score/worst@8/mean:np.float64(0.00013333333333333334) - 
val-aux/aime2024/score/worst@8/std:np.float64(0.002978441053825157) - val-aux/aime2024/score/maj@8/mean:np.float64(0.08906666666666666) - val-aux/aime2024/score/maj@8/std:np.float64(0.12125764081984834) - val-aux/aime2024/score/best@16/mean:np.float64(0.3035) - val-aux/aime2024/score/best@16/std:np.float64(0.07570482728901595) - val-aux/aime2024/score/worst@16/mean:np.float64(0.0) - val-aux/aime2024/score/worst@16/std:np.float64(0.0) - val-aux/aime2024/score/maj@16/mean:np.float64(0.09703333333333333) - val-aux/aime2024/score/maj@16/std:np.float64(0.10987353283031487) - val-core/aime2024/acc/mean@16:np.float64(0.06666666666666667) - val-aux/aime2024/acc/std@16:np.float64(0.12426769936730599) - val-aux/aime2024/acc/best@2/mean:np.float64(0.11526666666666667) - val-aux/aime2024/acc/best@2/std:np.float64(0.14536984516192072) - val-aux/aime2024/acc/worst@2/mean:np.float64(0.019000000000000003) - val-aux/aime2024/acc/worst@2/std:np.float64(0.06444035959593135) - val-aux/aime2024/acc/maj@2/mean:np.float64(0.06736666666666666) - val-aux/aime2024/acc/maj@2/std:np.float64(0.12449151891021958) - val-aux/aime2024/acc/best@4/mean:np.float64(0.18083333333333335) - val-aux/aime2024/acc/best@4/std:np.float64(0.14811118553475167) - val-aux/aime2024/acc/worst@4/mean:np.float64(0.002433333333333334) - val-aux/aime2024/acc/worst@4/std:np.float64(0.016555850575860113) - val-aux/aime2024/acc/maj@4/mean:np.float64(0.07863333333333335) - val-aux/aime2024/acc/maj@4/std:np.float64(0.12590367866461) - val-aux/aime2024/acc/best@8/mean:np.float64(0.24833333333333335) - val-aux/aime2024/acc/best@8/std:np.float64(0.12403624386056782) - val-aux/aime2024/acc/worst@8/mean:np.float64(0.00013333333333333334) - val-aux/aime2024/acc/worst@8/std:np.float64(0.002978441053825157) - val-aux/aime2024/acc/maj@8/mean:np.float64(0.08906666666666666) - val-aux/aime2024/acc/maj@8/std:np.float64(0.12125764081984834) - val-core/aime2024/acc/best@16/mean:np.float64(0.3035) - 
val-core/aime2024/acc/best@16/std:np.float64(0.07570482728901595) - val-aux/aime2024/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2024/acc/worst@16/std:np.float64(0.0) - val-core/aime2024/acc/maj@16/mean:np.float64(0.09703333333333333) - val-core/aime2024/acc/maj@16/std:np.float64(0.10987353283031487) - val-aux/aime2025/reward/mean@16:np.float64(0.052083333333333336) - val-aux/aime2025/reward/std@16:np.float64(0.08740966090756826) - val-aux/aime2025/reward/best@2/mean:np.float64(0.08786666666666668) - val-aux/aime2025/reward/best@2/std:np.float64(0.09814264375756772) - val-aux/aime2025/reward/worst@2/mean:np.float64(0.0188) - val-aux/aime2025/reward/worst@2/std:np.float64(0.05124223971416691) - val-aux/aime2025/reward/maj@2/mean:np.float64(0.05246666666666666) - val-aux/aime2025/reward/maj@2/std:np.float64(0.08655708602844324) - val-aux/aime2025/reward/best@4/mean:np.float64(0.13079999999999997) - val-aux/aime2025/reward/best@4/std:np.float64(0.09517557710266784) - val-aux/aime2025/reward/worst@4/mean:np.float64(0.0033333333333333335) - val-aux/aime2025/reward/worst@4/std:np.float64(0.0181137140980196) - val-aux/aime2025/reward/maj@4/mean:np.float64(0.062400000000000004) - val-aux/aime2025/reward/maj@4/std:np.float64(0.08699602207126303) - val-aux/aime2025/reward/best@8/mean:np.float64(0.17083333333333334) - val-aux/aime2025/reward/best@8/std:np.float64(0.07770134256509473) - val-aux/aime2025/reward/worst@8/mean:np.float64(6.666666666666667e-05) - val-aux/aime2025/reward/worst@8/std:np.float64(0.0014892205269125785) - val-aux/aime2025/reward/maj@8/mean:np.float64(0.07100000000000001) - val-aux/aime2025/reward/maj@8/std:np.float64(0.08342729552025943) - val-aux/aime2025/reward/best@16/mean:np.float64(0.20313333333333333) - val-aux/aime2025/reward/best@16/std:np.float64(0.054220364971369596) - val-aux/aime2025/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2025/reward/worst@16/std:np.float64(0.0) - 
val-aux/aime2025/reward/maj@16/mean:np.float64(0.07763333333333335) - val-aux/aime2025/reward/maj@16/std:np.float64(0.07631588060116214) - val-aux/aime2025/score/mean@16:np.float64(0.052083333333333336) - val-aux/aime2025/score/std@16:np.float64(0.08740966090756826) - val-aux/aime2025/score/best@2/mean:np.float64(0.08786666666666668) - val-aux/aime2025/score/best@2/std:np.float64(0.09814264375756772) - val-aux/aime2025/score/worst@2/mean:np.float64(0.0188) - val-aux/aime2025/score/worst@2/std:np.float64(0.05124223971416691) - val-aux/aime2025/score/maj@2/mean:np.float64(0.05246666666666666) - val-aux/aime2025/score/maj@2/std:np.float64(0.08655708602844324) - val-aux/aime2025/score/best@4/mean:np.float64(0.13079999999999997) - val-aux/aime2025/score/best@4/std:np.float64(0.09517557710266784) - val-aux/aime2025/score/worst@4/mean:np.float64(0.0033333333333333335) - val-aux/aime2025/score/worst@4/std:np.float64(0.0181137140980196) - val-aux/aime2025/score/maj@4/mean:np.float64(0.062400000000000004) - val-aux/aime2025/score/maj@4/std:np.float64(0.08699602207126303) - val-aux/aime2025/score/best@8/mean:np.float64(0.17083333333333334) - val-aux/aime2025/score/best@8/std:np.float64(0.07770134256509473) - val-aux/aime2025/score/worst@8/mean:np.float64(6.666666666666667e-05) - val-aux/aime2025/score/worst@8/std:np.float64(0.0014892205269125785) - val-aux/aime2025/score/maj@8/mean:np.float64(0.07100000000000001) - val-aux/aime2025/score/maj@8/std:np.float64(0.08342729552025943) - val-aux/aime2025/score/best@16/mean:np.float64(0.20313333333333333) - val-aux/aime2025/score/best@16/std:np.float64(0.054220364971369596) - val-aux/aime2025/score/worst@16/mean:np.float64(0.0) - val-aux/aime2025/score/worst@16/std:np.float64(0.0) - val-aux/aime2025/score/maj@16/mean:np.float64(0.07763333333333335) - val-aux/aime2025/score/maj@16/std:np.float64(0.07631588060116214) - val-core/aime2025/acc/mean@16:np.float64(0.052083333333333336) - 
val-aux/aime2025/acc/std@16:np.float64(0.08740966090756826) - val-aux/aime2025/acc/best@2/mean:np.float64(0.08786666666666668) - val-aux/aime2025/acc/best@2/std:np.float64(0.09814264375756772) - val-aux/aime2025/acc/worst@2/mean:np.float64(0.0188) - val-aux/aime2025/acc/worst@2/std:np.float64(0.05124223971416691) - val-aux/aime2025/acc/maj@2/mean:np.float64(0.05246666666666666) - val-aux/aime2025/acc/maj@2/std:np.float64(0.08655708602844324) - val-aux/aime2025/acc/best@4/mean:np.float64(0.13079999999999997) - val-aux/aime2025/acc/best@4/std:np.float64(0.09517557710266784) - val-aux/aime2025/acc/worst@4/mean:np.float64(0.0033333333333333335) - val-aux/aime2025/acc/worst@4/std:np.float64(0.0181137140980196) - val-aux/aime2025/acc/maj@4/mean:np.float64(0.062400000000000004) - val-aux/aime2025/acc/maj@4/std:np.float64(0.08699602207126303) - val-aux/aime2025/acc/best@8/mean:np.float64(0.17083333333333334) - val-aux/aime2025/acc/best@8/std:np.float64(0.07770134256509473) - val-aux/aime2025/acc/worst@8/mean:np.float64(6.666666666666667e-05) - val-aux/aime2025/acc/worst@8/std:np.float64(0.0014892205269125785) - val-aux/aime2025/acc/maj@8/mean:np.float64(0.07100000000000001) - val-aux/aime2025/acc/maj@8/std:np.float64(0.08342729552025943) - val-core/aime2025/acc/best@16/mean:np.float64(0.20313333333333333) - val-core/aime2025/acc/best@16/std:np.float64(0.054220364971369596) - val-aux/aime2025/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2025/acc/worst@16/std:np.float64(0.0) - val-core/aime2025/acc/maj@16/mean:np.float64(0.07763333333333335) - val-core/aime2025/acc/maj@16/std:np.float64(0.07631588060116214) - val-aux/math500/reward/mean@4:np.float64(0.669) - val-aux/math500/reward/std@4:np.float64(0.14492304845413265) - val-aux/math500/reward/best@2/mean:np.float64(0.7336320000000001) - val-aux/math500/reward/best@2/std:np.float64(0.12057811231807472) - val-aux/math500/reward/worst@2/mean:np.float64(0.6042519999999999) - 
val-aux/math500/reward/worst@2/std:np.float64(0.12840021415681635) - val-aux/math500/reward/maj@2/mean:np.float64(0.66898) - val-aux/math500/reward/maj@2/std:np.float64(0.14479478567591195) - val-aux/math500/reward/best@4/mean:np.float64(0.7822840000000001) - val-aux/math500/reward/best@4/std:np.float64(0.07687354080923087) - val-aux/math500/reward/worst@4/mean:np.float64(0.550106) - val-aux/math500/reward/worst@4/std:np.float64(0.09013931764676437) - val-aux/math500/reward/maj@4/mean:np.float64(0.682754) - val-aux/math500/reward/maj@4/std:np.float64(0.1321702252485895) - val-aux/math500/score/mean@4:np.float64(0.669) - val-aux/math500/score/std@4:np.float64(0.14492304845413265) - val-aux/math500/score/best@2/mean:np.float64(0.7336320000000001) - val-aux/math500/score/best@2/std:np.float64(0.12057811231807472) - val-aux/math500/score/worst@2/mean:np.float64(0.6042519999999999) - val-aux/math500/score/worst@2/std:np.float64(0.12840021415681635) - val-aux/math500/score/maj@2/mean:np.float64(0.66898) - val-aux/math500/score/maj@2/std:np.float64(0.14479478567591195) - val-aux/math500/score/best@4/mean:np.float64(0.7822840000000001) - val-aux/math500/score/best@4/std:np.float64(0.07687354080923087) - val-aux/math500/score/worst@4/mean:np.float64(0.550106) - val-aux/math500/score/worst@4/std:np.float64(0.09013931764676437) - val-aux/math500/score/maj@4/mean:np.float64(0.682754) - val-aux/math500/score/maj@4/std:np.float64(0.1321702252485895) - val-core/math500/acc/mean@4:np.float64(0.669) - val-aux/math500/acc/std@4:np.float64(0.14492304845413265) - val-aux/math500/acc/best@2/mean:np.float64(0.7336320000000001) - val-aux/math500/acc/best@2/std:np.float64(0.12057811231807472) - val-aux/math500/acc/worst@2/mean:np.float64(0.6042519999999999) - val-aux/math500/acc/worst@2/std:np.float64(0.12840021415681635) - val-aux/math500/acc/maj@2/mean:np.float64(0.66898) - val-aux/math500/acc/maj@2/std:np.float64(0.14479478567591195) - 
val-core/math500/acc/best@4/mean:np.float64(0.7822840000000001) - val-core/math500/acc/best@4/std:np.float64(0.07687354080923087) - val-aux/math500/acc/worst@4/mean:np.float64(0.550106) - val-aux/math500/acc/worst@4/std:np.float64(0.09013931764676437) - val-core/math500/acc/maj@4/mean:np.float64(0.682754) - val-core/math500/acc/maj@4/std:np.float64(0.1321702252485895) - val-aux/num_turns/min:np.int32(2) - val-aux/num_turns/max:np.int32(2) - val-aux/num_turns/mean:np.float64(2.0) - val-aux/response_length/clip_ratio:0.06554054054054054 - val-aux/aime2024/response_length/clip_ratio:0.17708333333333334 - val-aux/aime2025/response_length/clip_ratio:0.08333333333333333 - val-aux/math500/response_length/clip_ratio:0.0345 - training/global_step:150 - training/epoch:0 - critic/score/mean:0.5333333611488342 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5223615169525146 - critic/rewards/max:1.0053633451461792 - critic/rewards/min:-0.09285926818847656 - critic/advantages/mean:-0.13099327683448792 - critic/advantages/max:2.474832057952881 - critic/advantages/min:-2.4748306274414062 - critic/returns/mean:-0.13099327683448792 - critic/returns/max:2.474832057952881 - critic/returns/min:-2.4748306274414062 - response_length/mean:1366.37255859375 - response_length/max:8192.0 - response_length/min:298.0 - response_length/clip_ratio:0.01666666753590107 - response_length_non_aborted/mean:1366.37255859375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:298.0 - response_length_non_aborted/clip_ratio:0.01666666753590107 - response/aborted_ratio:0.0 - prompt_length/mean:235.6952362060547 - prompt_length/max:619.0 - prompt_length/min:179.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010402407497167587 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - 
timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9520757040008903) - timing_s/agent_loop/generate_sequences/max:np.float64(29.223983184434474) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.692514349936573) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.223983184434474) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:207 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.029054834507406 - timing_s/reward:0.00011684559285640717 - timing_s/old_log_prob:10.985769075341523 - timing_s/ref:12.06948306877166 - timing_s/adv:0.08506726007908583 - timing_s/update_actor:30.894640563987195 - timing_s/save_checkpoint:54.68504402320832 - timing_s/update_weights:23.763197500258684 - timing_s/step:163.91672850213945 - timing_s/testing:131.93467266578227 - timing_s/stop_profile:0.0004045255482196808 - timing_per_token_ms/adv:6.321239594295604e-05 - timing_per_token_ms/update_actor:0.022957413345986024 - timing_per_token_ms/gen:0.02703461008989513 - timing_per_token_ms/ref:0.008968678923721099 - perf/total_num_tokens:1674039 - perf/time_per_step:163.91672850213945 - perf/throughput:2553.1851070010684 - frontier/active_count:18.0 - frontier/completed_count:46.0 - frontier/blacklisted_count:1852.0 - frontier/mean_score:2.332209797267806 - frontier/mean_frontier_pct:0.7291364031757125 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:12.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - 
frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:144.0 - frontier/cluster_6/score:1.37077489229291 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.774803006999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:2.591305414593916 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - 
frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:2.229413890376691 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:160.0 - frontier/cluster_26/score:2.159091079506437 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:192.0 - frontier/cluster_27/score:1.4712466977649952 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:2.8337793918589993 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:128.0 - frontier/cluster_37/score:2.0017395824386996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:160.0 - frontier/cluster_41/score:2.928270710481462 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:128.0 - frontier/cluster_43/score:2.401175968505945 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:160.0 - frontier/cluster_44/score:1.9838134718656364 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:144.0 - frontier/cluster_45/score:2.2359927298529625 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:160.0 - frontier/cluster_46/score:2.2852020773370234 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:192.0 - frontier/cluster_49/score:3.0461790789845984 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:192.0 - frontier/cluster_51/score:1.8514648634464073 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:144.0 - frontier/cluster_53/score:2.673235695868489 - 
frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:176.0 - frontier/cluster_57/score:2.8998061371648927 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:192.0 - frontier/cluster_63/score:2.2424816614804417 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:150.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.032653220465909356 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.06609856574297268 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.06172747069776297 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0 - 
cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.05310685487568388 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.05143169562084237 - cluster/prob_snapshot/cluster_27:0.035046558739854725 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.06750344185203389 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.04768342655545317 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.06975431898469912 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.057198398305878854 - cluster/prob_snapshot/cluster_44:0.0472564087832941 - cluster/prob_snapshot/cluster_45:0.05326356937128512 - cluster/prob_snapshot/cluster_46:0.054435784941773716 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.07256301352174926 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.04410373337804168 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.06367913143530217 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.06907626455490193 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.053418142172560946
[36m(TaskRunner pid=2823680)[0m Training Progress:  19%|█▉        | 151/800 [4:43:34<27:27:25, 152.30s/it]
[36m(TaskRunner pid=2823680)[0m step:151 - global_seqlen/min:355201 - global_seqlen/max:457475 - global_seqlen/minmax_diff:102274 - global_seqlen/balanced_min:408793 - global_seqlen/balanced_max:408943 - global_seqlen/mean:408863.5 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.27518818942674744) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011180325411260128 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.09703141807403881) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006580693701228787) - actor/ppo_kl:np.float64(8.336760427655558e-05) - actor/pg_clipfrac_lower:np.float64(3.108710302512918e-07) - actor/grad_norm:np.float64(0.22627697388331094) - perf/mfu/actor:np.float64(0.21698319160840923) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(106.44194030761719) - actor/lr:np.float64(1e-06) - training/global_step:151 - training/epoch:0 - critic/score/mean:0.5644736886024475 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5538846254348755 - critic/rewards/max:1.0117061138153076 - critic/rewards/min:-0.04960203543305397 - critic/advantages/mean:-0.17292071878910065 - critic/advantages/max:2.4748141765594482 - critic/advantages/min:-2.4748339653015137 - critic/returns/mean:-0.17292071878910065 - critic/returns/max:2.4748141765594482 - critic/returns/min:-2.4748339653015137 - response_length/mean:1198.5855712890625 - response_length/max:8192.0 - response_length/min:185.0 - response_length/clip_ratio:0.007894736714661121 - response_length_non_aborted/mean:1198.5855712890625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:185.0 - response_length_non_aborted/clip_ratio:0.007894736714661121 - response/aborted_ratio:0.0 - prompt_length/mean:240.50526428222656 - prompt_length/max:572.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.421484380960464e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4614176405593753) - timing_s/agent_loop/generate_sequences/max:np.float64(29.321143977344036) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.246866703909291) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.321143977344036) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:228 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.266105107031763 - timing_s/reward:0.00028859730809926987 - timing_s/old_log_prob:11.265441028401256 - timing_s/ref:21.232661850750446 - timing_s/adv:0.08419399429112673 - timing_s/update_actor:23.05244201887399 - timing_s/update_weights:28.225309495814145 - timing_s/step:115.51353397406638 - timing_s/stop_profile:6.285868585109711e-05 - timing_per_token_ms/adv:7.698025186875735e-05 - timing_per_token_ms/update_actor:0.021077308515221133 - timing_per_token_ms/gen:0.034323468021002564 - timing_per_token_ms/ref:0.01941344713333295 - perf/total_num_tokens:1635454 - perf/time_per_step:115.51353397406638 - perf/throughput:3539.5289706208177 - frontier/active_count:18.0 - frontier/completed_count:46.0 - frontier/blacklisted_count:1885.0 - frontier/mean_score:2.4390219409188494 - frontier/mean_frontier_pct:0.7730308673140249 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:14.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:12.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:160.0 - frontier/cluster_6/score:1.2595424246050368 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:96.0 - frontier/cluster_11/score:2.8423621048999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:2.7139137902157406 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:2.229413890376691 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:160.0 - frontier/cluster_26/score:2.159091079506437 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:192.0 - frontier/cluster_27/score:1.9298726884354964 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:112.0 - frontier/cluster_34/score:2.8836455743012994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 
- frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:128.0 - frontier/cluster_37/score:2.3012177077070897 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:2.349789497337023 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:2.5808231779541613 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:160.0 - frontier/cluster_44/score:2.288669430305945 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:160.0 - frontier/cluster_45/score:2.4651949108970737 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:176.0 - frontier/cluster_46/score:2.499641454135916 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:192.0 - frontier/cluster_49/score:3.0323253552892186 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:208.0 - frontier/cluster_51/score:2.196025404412485 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - 
frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:144.0 - frontier/cluster_53/score:2.7712649871079424 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:176.0 - frontier/cluster_57/score:2.929864296015425 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:208.0 - frontier/cluster_63/score:2.469737163036309 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:151.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.028689606260107216 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.06474275740557255 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.061816987299638 - 
cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.05078114516529904 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0491793461980239 - cluster/prob_snapshot/cluster_27:0.04395825538048935 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.06568310404180902 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.05241667820248735 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.05352303674397793 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.05878547586492102 - cluster/prob_snapshot/cluster_44:0.0521308560413209 - cluster/prob_snapshot/cluster_45:0.05615171824818445 - cluster/prob_snapshot/cluster_46:0.05693633474322156 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.06906970245410145 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.05002062888794175 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.06312332142958928 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.06673586487139324 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - 
cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.056255180761922055
[36m(TaskRunner pid=2823680)[0m Training Progress:  19%|█▉        | 152/800 [4:45:32<25:33:13, 141.97s/it]
[36m(TaskRunner pid=2823680)[0m step:152 - global_seqlen/min:374049 - global_seqlen/max:530975 - global_seqlen/minmax_diff:156926 - global_seqlen/balanced_min:415723 - global_seqlen/balanced_max:415978 - global_seqlen/mean:415873.5 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.2618751563907911) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011631149798631668 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.01763723814883633) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0008649018865677741) - actor/ppo_kl:np.float64(0.00013883638155694675) - actor/pg_clipfrac_lower:np.float64(2.187325700712487e-06) - actor/grad_norm:np.float64(0.3849417145053546) - perf/mfu/actor:np.float64(0.21937936523473495) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(106.57361602783203) - actor/lr:np.float64(1e-06) - training/global_step:152 - training/epoch:0 - critic/score/mean:0.5208333134651184 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5094702839851379 - critic/rewards/max:1.0144538879394531 - critic/rewards/min:-0.07613968104124069 - critic/advantages/mean:-0.12644687294960022 - critic/advantages/max:2.4747800827026367 - critic/advantages/min:-2.474846363067627 - critic/returns/mean:-0.12644687294960022 - critic/returns/max:2.4747800827026367 - critic/returns/min:-2.474846363067627 - response_length/mean:1297.6796875 - response_length/max:8192.0 - response_length/min:242.0 - response_length/clip_ratio:0.01692708395421505 - response_length_non_aborted/mean:1297.6796875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:242.0 - response_length_non_aborted/clip_ratio:0.01692708395421505 - response/aborted_ratio:0.0 - prompt_length/mean:250.9375 - prompt_length/max:817.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.538605809211731e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.9082059301435947) - timing_s/agent_loop/generate_sequences/max:np.float64(29.960537555627525) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.5127863713614715) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.960537555627525) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:204 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.691384601406753 - timing_s/reward:0.00013455282896757126 - timing_s/old_log_prob:10.49583544023335 - timing_s/ref:22.736032764427364 - timing_s/adv:0.11279305815696716 - timing_s/update_actor:22.36956292577088 - timing_s/update_weights:29.740928813815117 - timing_s/step:117.52911375556141 - timing_s/stop_profile:5.532335489988327e-05 - timing_per_token_ms/adv:9.483684045827777e-05 - timing_per_token_ms/update_actor:0.018808415207258893 - timing_per_token_ms/gen:0.031798928577857065 - timing_per_token_ms/ref:0.019116544467953907 - perf/total_num_tokens:1663494 - perf/time_per_step:117.52911375556141 - perf/throughput:3538.4721854104946 - frontier/active_count:17.0 - frontier/completed_count:47.0 - frontier/blacklisted_count:1917.0 - frontier/mean_score:2.5249269279216526 - frontier/mean_frontier_pct:0.8275597800432924 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:13.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:96.0 - frontier/cluster_11/score:2.8896534734299992 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:192.0 - frontier/cluster_14/score:2.1997396531510183 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:2.229413890376691 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:176.0 - frontier/cluster_26/score:2.411363755654506 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:208.0 - frontier/cluster_27/score:2.250910881904847 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:112.0 - frontier/cluster_34/score:2.9185519020109094 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 
- frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:1.9108523953949628 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:2.349789497337023 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:2.706576224567913 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:176.0 - frontier/cluster_44/score:2.5020686012141615 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:160.0 - frontier/cluster_45/score:2.6256364376279513 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:176.0 - frontier/cluster_46/score:2.6497490178951413 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:208.0 - frontier/cluster_49/score:3.022627748702453 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:208.0 - frontier/cluster_51/score:2.437217783088739 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - 
frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.8398854909755595 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:192.0 - frontier/cluster_57/score:2.9509050072107974 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:224.0 - frontier/cluster_63/score:2.0288160141254163 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:152.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.06732060805578766 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.051247601962035524 - cluster/prob_snapshot/cluster_15:0.0 
- cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.05193892627202372 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.056177834389830546 - cluster/prob_snapshot/cluster_27:0.05243974429548304 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0679938582575201 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.04451736041905148 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.05474333141269788 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0630554351456439 - cluster/prob_snapshot/cluster_44:0.05829099619723377 - cluster/prob_snapshot/cluster_45:0.061169771095332615 - cluster/prob_snapshot/cluster_46:0.06173152480743236 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.07041852590283436 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.056780158808162146 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.06616115732186774 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.068747592480179 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - 
cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.04726557317688395
[36m(TaskRunner pid=2823680)[0m Training Progress:  19%|█▉        | 153/800 [4:47:32<24:18:53, 135.29s/it]
[36m(TaskRunner pid=2823680)[0m step:153 - global_seqlen/min:352944 - global_seqlen/max:479793 - global_seqlen/minmax_diff:126849 - global_seqlen/balanced_min:410255 - global_seqlen/balanced_max:410394 - global_seqlen/mean:410341.75 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.2492180049736449) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01188887283205986 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.032447149278596044) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006934961394166594) - actor/ppo_kl:np.float64(0.00013678938247760518) - actor/pg_clipfrac_lower:np.float64(6.996660982970733e-06) - actor/grad_norm:np.float64(0.22480597967902818) - perf/mfu/actor:np.float64(0.21562342964544134) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(104.86470413208008) - actor/lr:np.float64(1e-06) - training/global_step:153 - training/epoch:0 - critic/score/mean:0.5558510422706604 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5444384813308716 - critic/rewards/max:1.0075372457504272 - critic/rewards/min:-0.048033710569143295 - critic/advantages/mean:-0.15193553268909454 - critic/advantages/max:2.474850654602051 - critic/advantages/min:-2.4748213291168213 - critic/returns/mean:-0.15193553268909454 - critic/returns/max:2.474850654602051 - critic/returns/min:-2.4748213291168213 - response_length/mean:1302.78857421875 - response_length/max:8192.0 - response_length/min:204.0 - response_length/clip_ratio:0.017287233844399452 - response_length_non_aborted/mean:1302.78857421875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:204.0 - response_length_non_aborted/clip_ratio:0.017287233844399452 - response/aborted_ratio:0.0 - prompt_length/mean:249.46807861328125 - prompt_length/max:817.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.725188672542572e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5811501322314143) - timing_s/agent_loop/generate_sequences/max:np.float64(29.990737264975905) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.44934379154347) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.990737264975905) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:187 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.948125957511365 - timing_s/reward:0.00012553948909044266 - timing_s/old_log_prob:10.632852984592319 - timing_s/ref:23.99587473832071 - timing_s/adv:0.07376515213400126 - timing_s/update_actor:22.406721874140203 - timing_s/update_weights:30.049453874118626 - timing_s/step:119.51093073841184 - timing_s/stop_profile:4.92241233587265e-05 - timing_per_token_ms/adv:6.31931309118427e-05 - timing_per_token_ms/update_actor:0.019195390611078587 - timing_per_token_ms/gen:0.03261021107292496 - timing_per_token_ms/ref:0.020556786094987574 - perf/total_num_tokens:1641367 - perf/time_per_step:119.51093073841184 - perf/throughput:3433.5081106360476 - frontier/active_count:13.0 - frontier/completed_count:51.0 - frontier/blacklisted_count:1951.0 - frontier/mean_score:2.544323485506468 - frontier/mean_frontier_pct:0.8285511596706487 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:13.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:2.922757431400999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:208.0 - frontier/cluster_23/score:2.4605897232636833 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:176.0 - frontier/cluster_26/score:2.587954628958154 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:208.0 - frontier/cluster_27/score:2.4756376173333927 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:1.9108523953949628 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:2.544852648135916 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:160.0 - frontier/cluster_43/score:2.7946033571975386 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:176.0 - frontier/cluster_44/score:2.651448020849913 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:176.0 - frontier/cluster_45/score:2.7379455063395657 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:208.0 - frontier/cluster_49/score:3.0158394240917166 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.8879198436828917 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:208.0 - frontier/cluster_57/score:2.365633505047558 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:240.0 - frontier/cluster_63/score:1.7201712098877915 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:153.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.08836435146861842 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - 
cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.07439153615369311 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.07824218662869982 - cluster/prob_snapshot/cluster_27:0.07484648235831227 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.05777120976830242 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.07693907521019794 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.08448984189304211 - cluster/prob_snapshot/cluster_44:0.08016179594584003 - cluster/prob_snapshot/cluster_45:0.08277689295212687 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.09117851929149494 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.08731109921703965 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.07152070446905395 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.05200630464357851
[36m(TaskRunner pid=2823680)[0m Training Progress:  19%|█▉        | 154/800 [4:49:18<22:41:07, 126.42s/it]
[36m(TaskRunner pid=2823680)[0m step:154 - global_seqlen/min:343691 - global_seqlen/max:414818 - global_seqlen/minmax_diff:71127 - global_seqlen/balanced_min:383124 - global_seqlen/balanced_max:383266 - global_seqlen/mean:383194.0 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.22296403162181377) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012121208943426609 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.09517553944897372) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00043842406227426813) - actor/ppo_kl:np.float64(8.839423953666445e-05) - actor/pg_clipfrac_lower:np.float64(1.6249409503759626e-06) - actor/grad_norm:np.float64(0.2231499862212401) - perf/mfu/actor:np.float64(0.2213777803452303) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(104.6801986694336) - actor/lr:np.float64(1e-06) - training/global_step:154 - training/epoch:0 - critic/score/mean:0.5528350472450256 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5421928763389587 - critic/rewards/max:1.0063529014587402 - critic/rewards/min:-0.07172933220863342 - critic/advantages/mean:-0.13047146797180176 - critic/advantages/max:2.4747469425201416 - critic/advantages/min:-2.474853515625 - critic/returns/mean:-0.13047146797180176 - critic/returns/max:2.4747469425201416 - critic/returns/min:-2.474853515625 - response_length/mean:1112.80029296875 - response_length/max:8192.0 - response_length/min:73.0 - response_length/clip_ratio:0.005154639016836882 - response_length_non_aborted/mean:1112.80029296875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:73.0 - response_length_non_aborted/clip_ratio:0.005154639016836882 - response/aborted_ratio:0.0 - prompt_length/mean:235.226806640625 - prompt_length/max:667.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.581951260566711e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.666016380302608) - timing_s/agent_loop/generate_sequences/max:np.float64(28.46197447180748) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.844305171171982) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.46197447180748) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:256 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.35890954080969 - timing_s/reward:0.000217377208173275 - timing_s/old_log_prob:9.657664314843714 - timing_s/ref:18.91052504070103 - timing_s/adv:0.10116150602698326 - timing_s/update_actor:20.43920135591179 - timing_s/update_weights:25.600654783658683 - timing_s/step:105.48414129205048 - timing_s/stop_profile:6.16069883108139e-05 - timing_per_token_ms/adv:9.670634157687806e-05 - timing_per_token_ms/update_actor:0.01953905655928222 - timing_per_token_ms/gen:0.035156629266987705 - timing_per_token_ms/ref:0.01807770332616781 - perf/total_num_tokens:1532776 - perf/time_per_step:105.48414129205048 - perf/throughput:3632.7166842934557 - frontier/active_count:9.0 - frontier/completed_count:55.0 - frontier/blacklisted_count:1981.0 - frontier/mean_score:2.5302188313203167 - frontier/mean_frontier_pct:0.8085788532088349 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:13.0 - frontier/replay_slots_count:24.0 - frontier/replay_pool_size:4264.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:2.9459302019806994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - 
frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:208.0 - frontier/cluster_23/score:2.6224128062845784 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:192.0 - frontier/cluster_26/score:2.7115682402707075 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - 
frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:160.0 - frontier/cluster_37/score:1.6375966767764738 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:160.0 - frontier/cluster_43/score:2.856222350038277 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:192.0 - frontier/cluster_44/score:2.7560136145949388 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:176.0 - frontier/cluster_45/score:2.816561854437696 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:176.0 - 
frontier/cluster_53/score:2.921543890578024 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:256.0 - frontier/cluster_63/score:1.504119846921454 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:154.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.12936650930980967 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0 - 
cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.11515968385479103 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.11907482321316143 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.07191282590112943 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.12542711126987321 - cluster/prob_snapshot/cluster_44:0.12102658124443719 - cluster/prob_snapshot/cluster_45:0.12368547466562013 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.1282956176848197 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0660513728563582
[36m(RewardLoopWorker pid=2826754)[0m WARNING:2026-04-12 16:21:47,449:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  19%|█▉        | 155/800 [4:50:57<21:12:52, 118.41s/it]
[36m(TaskRunner pid=2823680)[0m step:155 - global_seqlen/min:326576 - global_seqlen/max:405216 - global_seqlen/minmax_diff:78640 - global_seqlen/balanced_min:374936 - global_seqlen/balanced_max:375143 - global_seqlen/mean:375011.25 - frontier/skipped_zero_acc_count:23.0 - actor/entropy:np.float64(0.21509526047925903) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011994503438472748 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.018825647774065146) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00032310139465822294) - actor/ppo_kl:np.float64(-7.998580859410092e-05) - actor/pg_clipfrac_lower:np.float64(6.871317380947008e-06) - actor/grad_norm:np.float64(0.22786405363253184) - perf/mfu/actor:np.float64(0.18176190039365442) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(104.7879638671875) - actor/lr:np.float64(1e-06) - training/global_step:155 - training/epoch:0 - critic/score/mean:0.5964285731315613 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5854892134666443 - critic/rewards/max:1.004967212677002 - critic/rewards/min:-0.06428442150354385 - critic/advantages/mean:-0.11624269932508469 - critic/advantages/max:2.4748270511627197 - critic/advantages/min:-2.4747934341430664 - critic/returns/mean:-0.11624269932508469 - critic/returns/max:2.4748270511627197 - critic/returns/min:-2.4747934341430664 - response_length/mean:1125.62744140625 - response_length/max:8192.0 - response_length/min:183.0 - response_length/clip_ratio:0.008333333767950535 - response_length_non_aborted/mean:1125.62744140625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:183.0 - response_length_non_aborted/clip_ratio:0.008333333767950535 - response/aborted_ratio:0.0 - prompt_length/mean:239.35238647460938 - prompt_length/max:515.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.052921086549759e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4295200314372778) - timing_s/agent_loop/generate_sequences/max:np.float64(28.618311065249145) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.587589404302889) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.618311065249145) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:186 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.113515608944 - timing_s/reward:0.00012401025742292404 - timing_s/old_log_prob:9.906398287042975 - timing_s/ref:12.5899030091241 - timing_s/adv:0.08410092815756798 - timing_s/update_actor:24.158204896375537 - timing_s/update_weights:21.256691053509712 - timing_s/step:99.51702021434903 - timing_s/stop_profile:5.467887967824936e-05 - timing_per_token_ms/adv:7.334918462733877e-05 - timing_per_token_ms/update_actor:0.02106973930049158 - timing_per_token_ms/gen:0.032906004385854656 - timing_per_token_ms/ref:0.010980367761535013 - perf/total_num_tokens:1500045 - perf/time_per_step:99.51702021434903 - perf/throughput:3768.312688545797 - frontier/active_count:8.0 - frontier/completed_count:56.0 - frontier/blacklisted_count:2003.0 - frontier/mean_score:2.6109368430591218 - frontier/mean_frontier_pct:0.841314806642912 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:6.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:13.0 - frontier/replay_slots_count:56.0 - 
frontier/replay_pool_size:4924.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.9621511413864896 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:224.0 - frontier/cluster_23/score:2.7356889643992046 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:192.0 - frontier/cluster_26/score:2.798097768189495 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:176.0 - frontier/cluster_37/score:1.4463176737435317 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:176.0 - frontier/cluster_43/score:2.8993556450267937 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:192.0 - frontier/cluster_44/score:2.829209530216457 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:192.0 - frontier/cluster_45/score:2.8715932981063865 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:192.0 - frontier/cluster_53/score:2.3450807234046165 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:155.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.14181457267249833 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0 - 
cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.13097257463694126 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.13396042954983378 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.06924323340051304 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.1388082046457003 - cluster/prob_snapshot/cluster_44:0.13544992182296503 - cluster/prob_snapshot/cluster_45:0.1374790674150253 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.11227199585652303 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  20%|█▉        | 156/800 [4:52:49<20:49:17, 116.39s/it]
[36m(TaskRunner pid=2823680)[0m step:156 - global_seqlen/min:332335 - global_seqlen/max:392200 - global_seqlen/minmax_diff:59865 - global_seqlen/balanced_min:369224 - global_seqlen/balanced_max:369271 - global_seqlen/mean:369256.25 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.21820120020614317) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013715478591620922 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.01806318311719224) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003842765175553116) - actor/ppo_kl:np.float64(4.221920840781953e-05) - actor/pg_clipfrac_lower:np.float64(1.6601857926919668e-06) - actor/grad_norm:np.float64(0.23452822988231978) - perf/mfu/actor:np.float64(0.19812793887156033) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(111.77278518676758) - actor/lr:np.float64(1e-06) - training/global_step:156 - training/epoch:0 - critic/score/mean:0.6419270634651184 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6311042308807373 - critic/rewards/max:1.0048258304595947 - critic/rewards/min:-0.045601729303598404 - critic/advantages/mean:-0.12398263067007065 - critic/advantages/max:2.474792718887329 - critic/advantages/min:-2.474860906600952 - critic/returns/mean:-0.12398263067007065 - critic/returns/max:2.474792718887329 - critic/returns/min:-2.474860906600952 - response_length/mean:1042.1171875 - response_length/max:8192.0 - response_length/min:175.0 - response_length/clip_ratio:0.0065104165114462376 - response_length_non_aborted/mean:1042.1171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:175.0 - response_length_non_aborted/clip_ratio:0.0065104165114462376 - response/aborted_ratio:0.0 - prompt_length/mean:233.3645782470703 - prompt_length/max:393.0 - prompt_length/min:179.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00011949241161346436 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.720786003395915) - timing_s/agent_loop/generate_sequences/max:np.float64(28.086972310207784) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.019721543862943) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.086972310207784) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:237 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.956574444659054 - timing_s/reward:0.0001376662403345108 - timing_s/old_log_prob:10.056281951256096 - timing_s/ref:21.052758366800845 - timing_s/adv:0.07697720918804407 - timing_s/update_actor:21.742003521881998 - timing_s/update_weights:28.21478006336838 - timing_s/step:111.48471198696643 - timing_s/stop_profile:5.5816955864429474e-05 - timing_per_token_ms/adv:7.858265278442997e-05 - timing_per_token_ms/update_actor:0.022195456702310195 - timing_per_token_ms/gen:0.03742952978419215 - timing_per_token_ms/ref:0.02149183658829981 - perf/total_num_tokens:1477025 - perf/time_per_step:111.48471198696643 - perf/throughput:3312.169385549199 - frontier/active_count:6.0 - frontier/completed_count:58.0 - frontier/blacklisted_count:2033.0 - frontier/mean_score:2.38763622915002 - frontier/mean_frontier_pct:0.8463351901854419 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:3.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:13.0 - frontier/replay_slots_count:64.0 - 
frontier/replay_pool_size:5099.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:144.0 - frontier/cluster_11/score:2.3735057989705424 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:208.0 - frontier/cluster_26/score:2.8586684377326463 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - 
frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:192.0 - frontier/cluster_37/score:1.3124223716204722 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:176.0 - frontier/cluster_43/score:2.9295489515187554 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:208.0 - frontier/cluster_45/score:2.31011530867447 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:302.0 - frontier/cluster_48/score:0.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:192.0 - frontier/cluster_53/score:2.5415565063832313 - 
frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:156.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.16568030548323884 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - 
cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.19954662012802443 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.09161239022353672 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.20449436669851312 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.16125539284913412 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.17741092461755267 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  20%|█▉        | 157/800 [4:54:45<20:45:43, 116.24s/it]
[36m(TaskRunner pid=2823680)[0m step:157 - global_seqlen/min:342456 - global_seqlen/max:466612 - global_seqlen/minmax_diff:124156 - global_seqlen/balanced_min:396066 - global_seqlen/balanced_max:396382 - global_seqlen/mean:396229.25 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.16854971224286905) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01091775856912136 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04833842431253288) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0007990684148353466) - actor/ppo_kl:np.float64(-0.0005478016775626315) - actor/pg_clipfrac_lower:np.float64(8.428477974575799e-05) - actor/grad_norm:np.float64(0.2152163734038671) - perf/mfu/actor:np.float64(0.2223056408264006) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(136.0404930114746) - actor/lr:np.float64(1e-06) - training/global_step:157 - training/epoch:0 - critic/score/mean:0.5486842393875122 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5384269952774048 - critic/rewards/max:1.0506749153137207 - critic/rewards/min:-0.06871568411588669 - critic/advantages/mean:-0.08872552961111069 - critic/advantages/max:2.474825143814087 - critic/advantages/min:-2.4748458862304688 - critic/returns/mean:-0.08872552961111069 - critic/returns/max:2.474825143814087 - critic/returns/min:-2.4748458862304688 - response_length/mean:1161.1842041015625 - response_length/max:8192.0 - response_length/min:175.0 - response_length/clip_ratio:0.009210526011884212 - response_length_non_aborted/mean:1161.1842041015625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:175.0 - response_length_non_aborted/clip_ratio:0.009210526011884212 - response/aborted_ratio:0.0 - prompt_length/mean:244.95790100097656 - prompt_length/max:508.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.621811866760254e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4353533033281565) - timing_s/agent_loop/generate_sequences/max:np.float64(30.204358558170497) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.3360341325014815) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.204358558170497) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:188 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.155900774523616 - timing_s/reward:0.00022304244339466095 - timing_s/old_log_prob:10.084504701197147 - timing_s/ref:23.65835953876376 - timing_s/adv:0.06543163303285837 - timing_s/update_actor:21.23977991193533 - timing_s/update_weights:28.022597358562052 - timing_s/step:115.61302645038813 - timing_s/stop_profile:6.497371941804886e-05 - timing_per_token_ms/adv:6.122727828741796e-05 - timing_per_token_ms/update_actor:0.01987500319269907 - timing_per_token_ms/gen:0.036437281330904944 - timing_per_token_ms/ref:0.022138175316154092 - perf/total_num_tokens:1584917 - perf/time_per_step:115.61302645038813 - perf/throughput:3427.2024716006367 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:33.0 - frontier/mean_score:2.0374999999999996 - frontier/mean_frontier_pct:0.008368472729316774 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:1.7 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:1.7 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.3 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.3 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - 
frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.3 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.3 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.0 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.3 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:2.9 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:0.0 - frontier/cluster_50/score:2.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.3 - frontier/cluster_54/cluster_size:122.0 - 
frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:1.7 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.3 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:157.0 - cluster/prob_snapshot/cluster_0:0.015337423312883439 - cluster/prob_snapshot/cluster_1:0.013036809815950923 - cluster/prob_snapshot/cluster_2:0.013036809815950923 - cluster/prob_snapshot/cluster_3:0.015337423312883439 - cluster/prob_snapshot/cluster_4:0.015337423312883439 - cluster/prob_snapshot/cluster_5:0.013036809815950923 - cluster/prob_snapshot/cluster_6:0.015337423312883439 - cluster/prob_snapshot/cluster_7:0.015337423312883439 - cluster/prob_snapshot/cluster_8:0.015337423312883439 - cluster/prob_snapshot/cluster_9:0.015337423312883439 - cluster/prob_snapshot/cluster_10:0.015337423312883439 - cluster/prob_snapshot/cluster_11:0.015337423312883439 - cluster/prob_snapshot/cluster_12:0.017638036809815953 - cluster/prob_snapshot/cluster_13:0.015337423312883439 - cluster/prob_snapshot/cluster_14:0.017638036809815953 - cluster/prob_snapshot/cluster_15:0.015337423312883439 - cluster/prob_snapshot/cluster_16:0.015337423312883439 - cluster/prob_snapshot/cluster_17:0.015337423312883439 
- cluster/prob_snapshot/cluster_18:0.015337423312883439 - cluster/prob_snapshot/cluster_19:0.015337423312883439 - cluster/prob_snapshot/cluster_20:0.015337423312883439 - cluster/prob_snapshot/cluster_21:0.015337423312883439 - cluster/prob_snapshot/cluster_22:0.015337423312883439 - cluster/prob_snapshot/cluster_23:0.015337423312883439 - cluster/prob_snapshot/cluster_24:0.015337423312883439 - cluster/prob_snapshot/cluster_25:0.015337423312883439 - cluster/prob_snapshot/cluster_26:0.017638036809815953 - cluster/prob_snapshot/cluster_27:0.015337423312883439 - cluster/prob_snapshot/cluster_28:0.015337423312883439 - cluster/prob_snapshot/cluster_29:0.015337423312883439 - cluster/prob_snapshot/cluster_30:0.015337423312883439 - cluster/prob_snapshot/cluster_31:0.017638036809815953 - cluster/prob_snapshot/cluster_32:0.015337423312883439 - cluster/prob_snapshot/cluster_33:0.015337423312883439 - cluster/prob_snapshot/cluster_34:0.017638036809815953 - cluster/prob_snapshot/cluster_35:0.015337423312883439 - cluster/prob_snapshot/cluster_36:0.015337423312883439 - cluster/prob_snapshot/cluster_37:0.017638036809815953 - cluster/prob_snapshot/cluster_38:0.022239263803680985 - cluster/prob_snapshot/cluster_39:0.017638036809815953 - cluster/prob_snapshot/cluster_40:0.015337423312883439 - cluster/prob_snapshot/cluster_41:0.015337423312883439 - cluster/prob_snapshot/cluster_42:0.015337423312883439 - cluster/prob_snapshot/cluster_43:0.015337423312883439 - cluster/prob_snapshot/cluster_44:0.015337423312883439 - cluster/prob_snapshot/cluster_45:0.015337423312883439 - cluster/prob_snapshot/cluster_46:0.015337423312883439 - cluster/prob_snapshot/cluster_47:0.015337423312883439 - cluster/prob_snapshot/cluster_48:0.015337423312883439 - cluster/prob_snapshot/cluster_49:0.015337423312883439 - cluster/prob_snapshot/cluster_50:0.015337423312883439 - cluster/prob_snapshot/cluster_51:0.015337423312883439 - cluster/prob_snapshot/cluster_52:0.015337423312883439 - 
cluster/prob_snapshot/cluster_53:0.013036809815950923 - cluster/prob_snapshot/cluster_54:0.017638036809815953 - cluster/prob_snapshot/cluster_55:0.015337423312883439 - cluster/prob_snapshot/cluster_56:0.013036809815950923 - cluster/prob_snapshot/cluster_57:0.017638036809815953 - cluster/prob_snapshot/cluster_58:0.015337423312883439 - cluster/prob_snapshot/cluster_59:0.017638036809815953 - cluster/prob_snapshot/cluster_60:0.015337423312883439 - cluster/prob_snapshot/cluster_61:0.015337423312883439 - cluster/prob_snapshot/cluster_62:0.015337423312883439 - cluster/prob_snapshot/cluster_63:0.015337423312883439
(RewardLoopWorker pid=2826760) WARNING:2026-04-12 16:27:10,977:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  20%|█▉        | 158/800 [4:56:35<20:22:56, 114.29s/it]
[36m(TaskRunner pid=2823680)[0m step:158 - global_seqlen/min:304784 - global_seqlen/max:472143 - global_seqlen/minmax_diff:167359 - global_seqlen/balanced_min:375153 - global_seqlen/balanced_max:375277 - global_seqlen/mean:375206.25 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.2158750833477825) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01143551617860794 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.06066505616763607) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0011207462979048917) - actor/ppo_kl:np.float64(-0.000969791750691229) - actor/pg_clipfrac_lower:np.float64(0.00014605072679311584) - actor/grad_norm:np.float64(0.270753958572944) - perf/mfu/actor:np.float64(0.206276814138696) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(114.90528106689453) - actor/lr:np.float64(1e-06) - training/global_step:158 - training/epoch:0 - critic/score/mean:0.609375 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6000379323959351 - critic/rewards/max:1.0032340288162231 - critic/rewards/min:-0.05294059216976166 - critic/advantages/mean:-0.08428782224655151 - critic/advantages/max:2.4748375415802 - critic/advantages/min:-2.4748544692993164 - critic/returns/mean:-0.08428782224655151 - critic/returns/max:2.4748375415802 - critic/returns/min:-2.4748544692993164 - response_length/mean:1013.7369995117188 - response_length/max:8192.0 - response_length/min:217.0 - response_length/clip_ratio:0.0052083334885537624 - response_length_non_aborted/mean:1013.7369995117188 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:217.0 - response_length_non_aborted/clip_ratio:0.0052083334885537624 - response/aborted_ratio:0.0 - prompt_length/mean:241.5416717529297 - prompt_length/max:404.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.13841649889946e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.9808089677244425) - timing_s/agent_loop/generate_sequences/max:np.float64(28.965649355202913) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.804235228663856) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.965649355202913) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:228 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.378185064531863 - timing_s/reward:0.00012015830725431442 - timing_s/old_log_prob:9.303621319122612 - timing_s/ref:20.50018125027418 - timing_s/adv:0.08077431097626686 - timing_s/update_actor:21.41078755725175 - timing_s/update_weights:27.460833373479545 - timing_s/step:109.52244719769806 - timing_s/stop_profile:5.7131052017211914e-05 - timing_per_token_ms/adv:8.378608560958915e-05 - timing_per_token_ms/update_actor:0.02220911645743055 - timing_per_token_ms/gen:0.03901892629186547 - timing_per_token_ms/ref:0.02126455701680008 - perf/total_num_tokens:1500825 - perf/time_per_step:109.52244719769806 - perf/throughput:3425.8388083925693 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:65.0 - frontier/mean_score:2.106875 - frontier/mean_frontier_pct:0.020126230474925434 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:1.7 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:1.7 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.7 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.3 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.9 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.3 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.3 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.3 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.0 - 
frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.3 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.3 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - 
frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:2.9299999999999997 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.3 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.3 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.3 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:0.0 - frontier/cluster_50/score:2.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.3 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - 
frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.09 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.9 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.51 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.7 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:158.0 - cluster/prob_snapshot/cluster_0:0.014832393948383269 - cluster/prob_snapshot/cluster_1:0.012607534856125778 - cluster/prob_snapshot/cluster_2:0.012607534856125778 - cluster/prob_snapshot/cluster_3:0.014832393948383269 - cluster/prob_snapshot/cluster_4:0.014832393948383269 - cluster/prob_snapshot/cluster_5:0.012607534856125778 - cluster/prob_snapshot/cluster_6:0.012607534856125778 - cluster/prob_snapshot/cluster_7:0.014832393948383269 - cluster/prob_snapshot/cluster_8:0.017057253040640756 - cluster/prob_snapshot/cluster_9:0.014832393948383269 - cluster/prob_snapshot/cluster_10:0.021506971225155738 - cluster/prob_snapshot/cluster_11:0.012607534856125778 - cluster/prob_snapshot/cluster_12:0.017057253040640756 - cluster/prob_snapshot/cluster_13:0.014832393948383269 - cluster/prob_snapshot/cluster_14:0.017057253040640756 - cluster/prob_snapshot/cluster_15:0.014832393948383269 - cluster/prob_snapshot/cluster_16:0.017057253040640756 - cluster/prob_snapshot/cluster_17:0.014832393948383269 - 
cluster/prob_snapshot/cluster_18:0.014832393948383269 - cluster/prob_snapshot/cluster_19:0.014832393948383269 - cluster/prob_snapshot/cluster_20:0.014832393948383269 - cluster/prob_snapshot/cluster_21:0.014832393948383269 - cluster/prob_snapshot/cluster_22:0.014832393948383269 - cluster/prob_snapshot/cluster_23:0.014832393948383269 - cluster/prob_snapshot/cluster_24:0.014832393948383269 - cluster/prob_snapshot/cluster_25:0.014832393948383269 - cluster/prob_snapshot/cluster_26:0.017057253040640756 - cluster/prob_snapshot/cluster_27:0.014832393948383269 - cluster/prob_snapshot/cluster_28:0.014832393948383269 - cluster/prob_snapshot/cluster_29:0.014832393948383269 - cluster/prob_snapshot/cluster_30:0.014832393948383269 - cluster/prob_snapshot/cluster_31:0.017057253040640756 - cluster/prob_snapshot/cluster_32:0.014832393948383269 - cluster/prob_snapshot/cluster_33:0.014832393948383269 - cluster/prob_snapshot/cluster_34:0.017057253040640756 - cluster/prob_snapshot/cluster_35:0.014832393948383269 - cluster/prob_snapshot/cluster_36:0.017057253040640756 - cluster/prob_snapshot/cluster_37:0.018614654405221 - cluster/prob_snapshot/cluster_38:0.021729457134381486 - cluster/prob_snapshot/cluster_39:0.017057253040640756 - cluster/prob_snapshot/cluster_40:0.021506971225155738 - cluster/prob_snapshot/cluster_41:0.014832393948383269 - cluster/prob_snapshot/cluster_42:0.014832393948383269 - cluster/prob_snapshot/cluster_43:0.017057253040640756 - cluster/prob_snapshot/cluster_44:0.014832393948383269 - cluster/prob_snapshot/cluster_45:0.017057253040640756 - cluster/prob_snapshot/cluster_46:0.014832393948383269 - cluster/prob_snapshot/cluster_47:0.014832393948383269 - cluster/prob_snapshot/cluster_48:0.014832393948383269 - cluster/prob_snapshot/cluster_49:0.017057253040640756 - cluster/prob_snapshot/cluster_50:0.014832393948383269 - cluster/prob_snapshot/cluster_51:0.014832393948383269 - cluster/prob_snapshot/cluster_52:0.014832393948383269 - 
cluster/prob_snapshot/cluster_53:0.012607534856125778 - cluster/prob_snapshot/cluster_54:0.017057253040640756 - cluster/prob_snapshot/cluster_55:0.014832393948383269 - cluster/prob_snapshot/cluster_56:0.015499851676060515 - cluster/prob_snapshot/cluster_57:0.017057253040640756 - cluster/prob_snapshot/cluster_58:0.021506971225155738 - cluster/prob_snapshot/cluster_59:0.018614654405221 - cluster/prob_snapshot/cluster_60:0.014832393948383269 - cluster/prob_snapshot/cluster_61:0.014832393948383269 - cluster/prob_snapshot/cluster_62:0.014832393948383269 - cluster/prob_snapshot/cluster_63:0.012607534856125778
[36m(TaskRunner pid=2823680)[0m Training Progress:  20%|█▉        | 159/800 [4:58:26<20:10:55, 113.35s/it]
[36m(TaskRunner pid=2823680)[0m step:159 - global_seqlen/min:331822 - global_seqlen/max:451925 - global_seqlen/minmax_diff:120103 - global_seqlen/balanced_min:396702 - global_seqlen/balanced_max:396805 - global_seqlen/mean:396754.5 - frontier/skipped_zero_acc_count:42.0 - actor/entropy:np.float64(0.18821176384077515) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01312216091901064 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.057403018639888614) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004586113818097443) - actor/ppo_kl:np.float64(-3.1159876521298938e-06) - actor/pg_clipfrac_lower:np.float64(1.4771626516702844e-06) - actor/grad_norm:np.float64(0.22414284538138995) - perf/mfu/actor:np.float64(0.23202041076821558) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(124.16449737548828) - actor/lr:np.float64(1e-06) - training/global_step:159 - training/epoch:0 - critic/score/mean:0.604651153087616 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5940061211585999 - critic/rewards/max:1.0027477741241455 - critic/rewards/min:-0.08037709444761276 - critic/advantages/mean:-0.16875477135181427 - critic/advantages/max:2.474855661392212 - critic/advantages/min:-2.474862813949585 - critic/returns/mean:-0.16875477135181427 - critic/returns/max:2.474855661392212 - critic/returns/min:-2.474862813949585 - response_length/mean:1121.7020263671875 - response_length/max:8192.0 - response_length/min:5.0 - response_length/clip_ratio:0.014534884132444859 - response_length_non_aborted/mean:1121.7020263671875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:5.0 - response_length_non_aborted/clip_ratio:0.014534884132444859 - response/aborted_ratio:0.0 - prompt_length/mean:237.093017578125 - prompt_length/max:347.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.30506905913353e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.20750104077160358) - timing_s/agent_loop/generate_sequences/max:np.float64(28.922722754999995) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.032057825065749) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.922722754999995) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:216 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.48154655750841 - timing_s/reward:0.00021398253738880157 - timing_s/old_log_prob:9.455767611972988 - timing_s/ref:20.58283055666834 - timing_s/adv:0.06475437618792057 - timing_s/update_actor:20.218179329298437 - timing_s/update_weights:28.72788705304265 - timing_s/step:110.94591160211712 - timing_s/stop_profile:6.219558417797089e-05 - timing_per_token_ms/adv:6.926705559273143e-05 - timing_per_token_ms/update_actor:0.02162716767623764 - timing_per_token_ms/gen:0.04079341967279843 - timing_per_token_ms/ref:0.022017231148780222 - perf/total_num_tokens:1587018 - perf/time_per_step:110.94591160211712 - perf/throughput:3576.1074407398796 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:107.0 - frontier/mean_score:2.129328125 - frontier/mean_frontier_pct:0.033173135240511346 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.09 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:1.7 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.51 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.7 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:3.53 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.3 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.3 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.3 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - 
frontier/cluster_18/score:1.7 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.3 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.51 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.3 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:2.9299999999999997 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.3 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.3 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.51 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:0.0 - frontier/cluster_50/score:2.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.51 - 
frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.09 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.9299999999999997 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.6569999999999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.7 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:159.0 - cluster/prob_snapshot/cluster_0:0.014675990812829753 - cluster/prob_snapshot/cluster_1:0.01533641039940709 - cluster/prob_snapshot/cluster_2:0.01247459219090529 - cluster/prob_snapshot/cluster_3:0.014675990812829753 - cluster/prob_snapshot/cluster_4:0.014675990812829753 - cluster/prob_snapshot/cluster_5:0.01247459219090529 - cluster/prob_snapshot/cluster_6:0.010933613155558165 - cluster/prob_snapshot/cluster_7:0.014675990812829753 - cluster/prob_snapshot/cluster_8:0.018418368470101337 - cluster/prob_snapshot/cluster_9:0.01247459219090529 - cluster/prob_snapshot/cluster_10:0.025903123784644513 - cluster/prob_snapshot/cluster_11:0.01247459219090529 - cluster/prob_snapshot/cluster_12:0.016877389434754215 - cluster/prob_snapshot/cluster_13:0.014675990812829753 - cluster/prob_snapshot/cluster_14:0.016877389434754215 - cluster/prob_snapshot/cluster_15:0.014675990812829753 - 
cluster/prob_snapshot/cluster_16:0.016877389434754215 - cluster/prob_snapshot/cluster_17:0.014675990812829753 - cluster/prob_snapshot/cluster_18:0.01247459219090529 - cluster/prob_snapshot/cluster_19:0.014675990812829753 - cluster/prob_snapshot/cluster_20:0.014675990812829753 - cluster/prob_snapshot/cluster_21:0.014675990812829753 - cluster/prob_snapshot/cluster_22:0.01247459219090529 - cluster/prob_snapshot/cluster_23:0.014675990812829753 - cluster/prob_snapshot/cluster_24:0.014675990812829753 - cluster/prob_snapshot/cluster_25:0.014675990812829753 - cluster/prob_snapshot/cluster_26:0.014015571226252414 - cluster/prob_snapshot/cluster_27:0.016877389434754215 - cluster/prob_snapshot/cluster_28:0.014675990812829753 - cluster/prob_snapshot/cluster_29:0.014675990812829753 - cluster/prob_snapshot/cluster_30:0.014675990812829753 - cluster/prob_snapshot/cluster_31:0.018418368470101337 - cluster/prob_snapshot/cluster_32:0.014675990812829753 - cluster/prob_snapshot/cluster_33:0.014675990812829753 - cluster/prob_snapshot/cluster_34:0.016877389434754215 - cluster/prob_snapshot/cluster_35:0.016877389434754215 - cluster/prob_snapshot/cluster_36:0.016877389434754215 - cluster/prob_snapshot/cluster_37:0.018418368470101337 - cluster/prob_snapshot/cluster_38:0.021500326540795586 - cluster/prob_snapshot/cluster_39:0.016877389434754215 - cluster/prob_snapshot/cluster_40:0.021280186678603142 - cluster/prob_snapshot/cluster_41:0.016877389434754215 - cluster/prob_snapshot/cluster_42:0.014675990812829753 - cluster/prob_snapshot/cluster_43:0.016877389434754215 - cluster/prob_snapshot/cluster_44:0.014675990812829753 - cluster/prob_snapshot/cluster_45:0.016877389434754215 - cluster/prob_snapshot/cluster_46:0.014675990812829753 - cluster/prob_snapshot/cluster_47:0.014675990812829753 - cluster/prob_snapshot/cluster_48:0.014675990812829753 - cluster/prob_snapshot/cluster_49:0.018418368470101337 - cluster/prob_snapshot/cluster_50:0.014675990812829753 - 
cluster/prob_snapshot/cluster_51:0.014675990812829753 - cluster/prob_snapshot/cluster_52:0.014675990812829753 - cluster/prob_snapshot/cluster_53:0.01247459219090529 - cluster/prob_snapshot/cluster_54:0.018418368470101337 - cluster/prob_snapshot/cluster_55:0.014675990812829753 - cluster/prob_snapshot/cluster_56:0.01533641039940709 - cluster/prob_snapshot/cluster_57:0.016877389434754215 - cluster/prob_snapshot/cluster_58:0.021500326540795586 - cluster/prob_snapshot/cluster_59:0.019497053794844323 - cluster/prob_snapshot/cluster_60:0.014675990812829753 - cluster/prob_snapshot/cluster_61:0.014675990812829753 - cluster/prob_snapshot/cluster_62:0.014675990812829753 - cluster/prob_snapshot/cluster_63:0.01247459219090529
[36m(TaskRunner pid=2823680)[0m Training Progress:  20%|██        | 160/800 [5:00:29<20:40:52, 116.33s/it]
[36m(TaskRunner pid=2823680)[0m step:160 - global_seqlen/min:362810 - global_seqlen/max:458260 - global_seqlen/minmax_diff:95450 - global_seqlen/balanced_min:415711 - global_seqlen/balanced_max:415832 - global_seqlen/mean:415771.5 - frontier/skipped_zero_acc_count:36.0 - actor/entropy:np.float64(0.20731008494191844) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011743281036615372 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.005157630124813295) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004702013891643298) - actor/ppo_kl:np.float64(3.28143993044705e-05) - actor/pg_clipfrac_lower:np.float64(4.174108772017264e-06) - actor/grad_norm:np.float64(0.23092665895819664) - perf/mfu/actor:np.float64(0.22316469496516828) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(255.6890640258789) - actor/lr:np.float64(1e-06) - training/global_step:160 - training/epoch:0 - critic/score/mean:0.5489130616188049 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.538262665271759 - critic/rewards/max:1.0096197128295898 - critic/rewards/min:-0.09252666682004929 - critic/advantages/mean:-0.16499383747577667 - critic/advantages/max:2.4747676849365234 - critic/advantages/min:-2.474832773208618 - critic/returns/mean:-0.16499383747577667 - critic/returns/max:2.4747676849365234 - critic/returns/min:-2.474832773208618 - response_length/mean:1270.4552001953125 - response_length/max:8192.0 - response_length/min:263.0 - response_length/clip_ratio:0.01766304299235344 - response_length_non_aborted/mean:1270.4552001953125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:263.0 - response_length_non_aborted/clip_ratio:0.01766304299235344 - response/aborted_ratio:0.0 - prompt_length/mean:237.13043212890625 - prompt_length/max:535.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.277090430259705e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.9750916454941034) - timing_s/agent_loop/generate_sequences/max:np.float64(30.162925014272332) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.249745353569779) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.162925014272332) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:189 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.048228641971946 - timing_s/reward:0.00013268087059259415 - timing_s/old_log_prob:10.802471072413027 - timing_s/ref:25.765573617070913 - timing_s/adv:0.0668513523414731 - timing_s/update_actor:21.95068935677409 - timing_s/update_weights:31.446454784832895 - timing_s/step:122.47493620216846 - timing_s/stop_profile:5.841068923473358e-05 - timing_per_token_ms/adv:6.024907766383687e-05 - timing_per_token_ms/update_actor:0.019782827744093134 - timing_per_token_ms/gen:0.03427416423843725 - timing_per_token_ms/ref:0.023220952030691633 - perf/total_num_tokens:1663086 - perf/time_per_step:122.47493620216846 - perf/throughput:3394.7476348441537 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:143.0 - frontier/mean_score:2.1546078125 - frontier/mean_frontier_pct:0.04642370946139018 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.09 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:1.49 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.51 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.7 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:3.53 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.3 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.3 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - 
frontier/cluster_18/score:1.7 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.7 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.3 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:3.11 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.6569999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.3 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.3 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:2.9299999999999997 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:1.7 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.3 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.3 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:0.0 - frontier/cluster_50/score:2.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.3 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.51 - 
frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.09 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.51 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:32.0 - frontier/cluster_58/score:2.9509999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:32.0 - frontier/cluster_59/score:2.7598999999999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.7 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:160.0 - cluster/prob_snapshot/cluster_0:0.01450379963290883 - cluster/prob_snapshot/cluster_1:0.015156470616389727 - cluster/prob_snapshot/cluster_2:0.01080533072651708 - cluster/prob_snapshot/cluster_3:0.01450379963290883 - cluster/prob_snapshot/cluster_4:0.01450379963290883 - cluster/prob_snapshot/cluster_5:0.01080533072651708 - cluster/prob_snapshot/cluster_6:0.01080533072651708 - cluster/prob_snapshot/cluster_7:0.01450379963290883 - cluster/prob_snapshot/cluster_8:0.01820226853930058 - cluster/prob_snapshot/cluster_9:0.012328229687972505 - cluster/prob_snapshot/cluster_10:0.025599206352084083 - cluster/prob_snapshot/cluster_11:0.012328229687972505 - cluster/prob_snapshot/cluster_12:0.016679369577845153 - cluster/prob_snapshot/cluster_13:0.01450379963290883 - cluster/prob_snapshot/cluster_14:0.016679369577845153 - cluster/prob_snapshot/cluster_15:0.01450379963290883 - 
cluster/prob_snapshot/cluster_16:0.01820226853930058 - cluster/prob_snapshot/cluster_17:0.01450379963290883 - cluster/prob_snapshot/cluster_18:0.012328229687972505 - cluster/prob_snapshot/cluster_19:0.01450379963290883 - cluster/prob_snapshot/cluster_20:0.01450379963290883 - cluster/prob_snapshot/cluster_21:0.012328229687972505 - cluster/prob_snapshot/cluster_22:0.01080533072651708 - cluster/prob_snapshot/cluster_23:0.016679369577845153 - cluster/prob_snapshot/cluster_24:0.01450379963290883 - cluster/prob_snapshot/cluster_25:0.01450379963290883 - cluster/prob_snapshot/cluster_26:0.013851128649427932 - cluster/prob_snapshot/cluster_27:0.022553408429173232 - cluster/prob_snapshot/cluster_28:0.01450379963290883 - cluster/prob_snapshot/cluster_29:0.01450379963290883 - cluster/prob_snapshot/cluster_30:0.01450379963290883 - cluster/prob_snapshot/cluster_31:0.01926829781231938 - cluster/prob_snapshot/cluster_32:0.016679369577845153 - cluster/prob_snapshot/cluster_33:0.01450379963290883 - cluster/prob_snapshot/cluster_34:0.016679369577845153 - cluster/prob_snapshot/cluster_35:0.016679369577845153 - cluster/prob_snapshot/cluster_36:0.016679369577845153 - cluster/prob_snapshot/cluster_37:0.01820226853930058 - cluster/prob_snapshot/cluster_38:0.021248066462211435 - cluster/prob_snapshot/cluster_39:0.016679369577845153 - cluster/prob_snapshot/cluster_40:0.021030509467717805 - cluster/prob_snapshot/cluster_41:0.016679369577845153 - cluster/prob_snapshot/cluster_42:0.012328229687972505 - cluster/prob_snapshot/cluster_43:0.016679369577845153 - cluster/prob_snapshot/cluster_44:0.01450379963290883 - cluster/prob_snapshot/cluster_45:0.016679369577845153 - cluster/prob_snapshot/cluster_46:0.01450379963290883 - cluster/prob_snapshot/cluster_47:0.01450379963290883 - cluster/prob_snapshot/cluster_48:0.01450379963290883 - cluster/prob_snapshot/cluster_49:0.01926829781231938 - cluster/prob_snapshot/cluster_50:0.01450379963290883 - cluster/prob_snapshot/cluster_51:0.01450379963290883 - 
cluster/prob_snapshot/cluster_52:0.016679369577845153 - cluster/prob_snapshot/cluster_53:0.012328229687972505 - cluster/prob_snapshot/cluster_54:0.01820226853930058 - cluster/prob_snapshot/cluster_55:0.01450379963290883 - cluster/prob_snapshot/cluster_56:0.015156470616389727 - cluster/prob_snapshot/cluster_57:0.01820226853930058 - cluster/prob_snapshot/cluster_58:0.02140035635835698 - cluster/prob_snapshot/cluster_59:0.020014518303432538 - cluster/prob_snapshot/cluster_60:0.01450379963290883 - cluster/prob_snapshot/cluster_61:0.01450379963290883 - cluster/prob_snapshot/cluster_62:0.016679369577845153 - cluster/prob_snapshot/cluster_63:0.012328229687972505
[36m(TaskRunner pid=2823680)[0m Training Progress:  20%|██        | 161/800 [5:02:24<20:34:25, 115.91s/it]
[36m(TaskRunner pid=2823680)[0m step:161 - global_seqlen/min:323515 - global_seqlen/max:498629 - global_seqlen/minmax_diff:175114 - global_seqlen/balanced_min:420485 - global_seqlen/balanced_max:420599 - global_seqlen/mean:420542.0 - frontier/skipped_zero_acc_count:36.0 - actor/entropy:np.float64(0.19929958641043174) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013537367805838585 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.09371548902709037) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00043123827688997034) - actor/ppo_kl:np.float64(3.3539287147979614e-05) - actor/pg_clipfrac_lower:np.float64(2.148221443823549e-06) - actor/grad_norm:np.float64(0.22084747875730196) - perf/mfu/actor:np.float64(0.25042096878513953) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(258.25959396362305) - actor/lr:np.float64(1e-06) - training/global_step:161 - training/epoch:0 - critic/score/mean:0.5461956262588501 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5339661836624146 - critic/rewards/max:1.009621262550354 - critic/rewards/min:-0.055107444524765015 - critic/advantages/mean:-0.10040438920259476 - critic/advantages/max:2.4748315811157227 - critic/advantages/min:-2.4748342037200928 - critic/returns/mean:-0.10040438920259476 - critic/returns/max:2.4748315811157227 - critic/returns/min:-2.4748342037200928 - response_length/mean:1187.702392578125 - response_length/max:8192.0 - response_length/min:207.0 - response_length/clip_ratio:0.009510869160294533 - response_length_non_aborted/mean:1187.702392578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:207.0 - response_length_non_aborted/clip_ratio:0.009510869160294533 - response/aborted_ratio:0.0 - prompt_length/mean:244.28260803222656 - prompt_length/max:667.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.6089206635952e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(2.068214524537325) - timing_s/agent_loop/generate_sequences/max:np.float64(30.869296859018505) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.963962794972758) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.869296859018505) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:197 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.52867973968387 - timing_s/reward:0.00022114068269729614 - timing_s/old_log_prob:9.505153912119567 - timing_s/ref:22.28982095886022 - timing_s/adv:0.07239053398370743 - timing_s/update_actor:19.804158378392458 - timing_s/update_weights:29.440238445997238 - timing_s/step:114.01688640750945 - timing_s/stop_profile:7.597915828227997e-05 - timing_per_token_ms/adv:6.868556587485203e-05 - timing_per_token_ms/update_actor:0.018790575922553975 - timing_per_token_ms/gen:0.037211825146152275 - timing_per_token_ms/ref:0.021149021585515904 - perf/total_num_tokens:1682168 - perf/time_per_step:114.01688640750945 - perf/throughput:3688.4185601853274 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:179.0 - frontier/mean_score:2.1802442187499995 - frontier/mean_frontier_pct:0.0666237147216364 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.09 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:1.49 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.51 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:48.0 - frontier/cluster_10/score:3.9709999999999996 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:2.51 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.7 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.7 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.3 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:3.11 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.6569999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:1.91 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.3 - frontier/cluster_35/cluster_size:184.0 - 
frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:2.9299999999999997 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:1.7 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.3 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.3 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.3 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:2.51 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - 
frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.51 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.3629999999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.51 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:32.0 - frontier/cluster_58/score:2.9656999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:32.0 - frontier/cluster_59/score:2.8319299999999994 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.7 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:161.0 - cluster/prob_snapshot/cluster_0:0.01433325667429889 - cluster/prob_snapshot/cluster_1:0.01497825322464234 - cluster/prob_snapshot/cluster_2:0.010678276222352673 - cluster/prob_snapshot/cluster_3:0.016483245175443723 - cluster/prob_snapshot/cluster_4:0.01433325667429889 - cluster/prob_snapshot/cluster_5:0.010678276222352673 - cluster/prob_snapshot/cluster_6:0.010678276222352673 - cluster/prob_snapshot/cluster_7:0.01433325667429889 - cluster/prob_snapshot/cluster_8:0.017988237126245105 - cluster/prob_snapshot/cluster_9:0.010678276222352673 - cluster/prob_snapshot/cluster_10:0.028458681126820443 - cluster/prob_snapshot/cluster_11:0.012183268173154056 - cluster/prob_snapshot/cluster_12:0.017988237126245105 - cluster/prob_snapshot/cluster_13:0.01433325667429889 - cluster/prob_snapshot/cluster_14:0.017988237126245105 - 
cluster/prob_snapshot/cluster_15:0.01433325667429889 - cluster/prob_snapshot/cluster_16:0.017988237126245105 - cluster/prob_snapshot/cluster_17:0.01433325667429889 - cluster/prob_snapshot/cluster_18:0.012183268173154056 - cluster/prob_snapshot/cluster_19:0.01433325667429889 - cluster/prob_snapshot/cluster_20:0.01433325667429889 - cluster/prob_snapshot/cluster_21:0.012183268173154056 - cluster/prob_snapshot/cluster_22:0.010678276222352673 - cluster/prob_snapshot/cluster_23:0.016483245175443723 - cluster/prob_snapshot/cluster_24:0.01433325667429889 - cluster/prob_snapshot/cluster_25:0.01433325667429889 - cluster/prob_snapshot/cluster_26:0.013688260123955439 - cluster/prob_snapshot/cluster_27:0.022288214128534774 - cluster/prob_snapshot/cluster_28:0.016483245175443723 - cluster/prob_snapshot/cluster_29:0.012183268173154056 - cluster/prob_snapshot/cluster_30:0.01433325667429889 - cluster/prob_snapshot/cluster_31:0.019041731491806074 - cluster/prob_snapshot/cluster_32:0.013688260123955439 - cluster/prob_snapshot/cluster_33:0.01433325667429889 - cluster/prob_snapshot/cluster_34:0.016483245175443723 - cluster/prob_snapshot/cluster_35:0.016483245175443723 - cluster/prob_snapshot/cluster_36:0.016483245175443723 - cluster/prob_snapshot/cluster_37:0.017988237126245105 - cluster/prob_snapshot/cluster_38:0.02099822102784787 - cluster/prob_snapshot/cluster_39:0.016483245175443723 - cluster/prob_snapshot/cluster_40:0.02078322217773339 - cluster/prob_snapshot/cluster_41:0.016483245175443723 - cluster/prob_snapshot/cluster_42:0.012183268173154056 - cluster/prob_snapshot/cluster_43:0.016483245175443723 - cluster/prob_snapshot/cluster_44:0.01433325667429889 - cluster/prob_snapshot/cluster_45:0.016483245175443723 - cluster/prob_snapshot/cluster_46:0.01433325667429889 - cluster/prob_snapshot/cluster_47:0.01433325667429889 - cluster/prob_snapshot/cluster_48:0.016483245175443723 - cluster/prob_snapshot/cluster_49:0.019041731491806074 - 
cluster/prob_snapshot/cluster_50:0.012183268173154056 - cluster/prob_snapshot/cluster_51:0.016483245175443723 - cluster/prob_snapshot/cluster_52:0.017988237126245105 - cluster/prob_snapshot/cluster_53:0.012183268173154056 - cluster/prob_snapshot/cluster_54:0.017988237126245105 - cluster/prob_snapshot/cluster_55:0.01433325667429889 - cluster/prob_snapshot/cluster_56:0.016934742760684136 - cluster/prob_snapshot/cluster_57:0.017988237126245105 - cluster/prob_snapshot/cluster_58:0.021254069659484107 - cluster/prob_snapshot/cluster_59:0.020295389786823624 - cluster/prob_snapshot/cluster_60:0.01433325667429889 - cluster/prob_snapshot/cluster_61:0.01433325667429889 - cluster/prob_snapshot/cluster_62:0.017988237126245105 - cluster/prob_snapshot/cluster_63:0.012183268173154056
[36m(TaskRunner pid=2823680)[0m Training Progress:  20%|██        | 162/800 [5:04:29<21:02:24, 118.72s/it]
[36m(TaskRunner pid=2823680)[0m step:162 - global_seqlen/min:370519 - global_seqlen/max:456366 - global_seqlen/minmax_diff:85847 - global_seqlen/balanced_min:406370 - global_seqlen/balanced_max:406467 - global_seqlen/mean:406419.0 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.19195805902176716) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01233102660626173 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.07899462795830914) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00028101244745986595) - actor/ppo_kl:np.float64(1.414431698989613e-06) - actor/pg_clipfrac_lower:np.float64(4.3765704755601966e-06) - actor/grad_norm:np.float64(0.22904796650012335) - perf/mfu/actor:np.float64(0.22365588751202964) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(262.2708988189697) - actor/lr:np.float64(1e-06) - training/global_step:162 - training/epoch:0 - critic/score/mean:0.5651595592498779 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5541818141937256 - critic/rewards/max:1.0096635818481445 - critic/rewards/min:-0.057693298906087875 - critic/advantages/mean:-0.15364505350589752 - critic/advantages/max:2.47480845451355 - critic/advantages/min:-2.474848508834839 - critic/returns/mean:-0.15364505350589752 - critic/returns/max:2.47480845451355 - critic/returns/min:-2.474848508834839 - response_length/mean:1218.88037109375 - response_length/max:8192.0 - response_length/min:152.0 - response_length/clip_ratio:0.010638297535479069 - response_length_non_aborted/mean:1218.88037109375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:152.0 - response_length_non_aborted/clip_ratio:0.010638297535479069 - response/aborted_ratio:0.0 - prompt_length/mean:234.36170959472656 - prompt_length/max:380.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.0002445932477712631 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1136746183037758) - timing_s/agent_loop/generate_sequences/max:np.float64(29.911324301734567) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.106212767582292) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.911324301734567) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:299 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.888730906881392 - timing_s/reward:0.0013325689360499382 - timing_s/old_log_prob:10.677203255705535 - timing_s/ref:25.838113782927394 - timing_s/adv:0.115683913230896 - timing_s/update_actor:21.463042649440467 - timing_s/update_weights:34.50499934051186 - timing_s/step:125.01490759011358 - timing_s/stop_profile:7.655471563339233e-05 - timing_per_token_ms/adv:0.00010585641534325856 - timing_per_token_ms/update_actor:0.019639729447036493 - timing_per_token_ms/gen:0.034790312554556516 - timing_per_token_ms/ref:0.023643132635328745 - perf/total_num_tokens:1625676 - perf/time_per_step:125.01490759011358 - perf/throughput:3250.9642876554058 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:213.0 - frontier/mean_score:2.202017984375 - frontier/mean_frontier_pct:0.07994247174163871 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:2.3629999999999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:1.49 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.51 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:48.0 - frontier/cluster_10/score:3.9709999999999996 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:2.51 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.7 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.3 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.3 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:1.637 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.4769999999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:2.51 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.3 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.6569999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.637 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.11 - frontier/cluster_35/cluster_size:184.0 - 
frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:2.9509999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:1.7 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.3 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.3 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.3 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:2.51 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:32.0 - frontier/cluster_53/score:1.49 - frontier/cluster_53/cluster_size:300.0 - 
frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.51 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.5540999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.6569999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:32.0 - frontier/cluster_58/score:2.9656999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:2.8823509999999994 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.7 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:162.0 - cluster/prob_snapshot/cluster_0:0.014191528053695578 - cluster/prob_snapshot/cluster_1:0.016767290395441323 - cluster/prob_snapshot/cluster_2:0.010572688400003204 - cluster/prob_snapshot/cluster_3:0.01632025726174991 - cluster/prob_snapshot/cluster_4:0.014191528053695578 - cluster/prob_snapshot/cluster_5:0.010572688400003204 - cluster/prob_snapshot/cluster_6:0.010572688400003204 - cluster/prob_snapshot/cluster_7:0.014191528053695578 - cluster/prob_snapshot/cluster_8:0.017810367707387947 - cluster/prob_snapshot/cluster_9:0.010572688400003204 - cluster/prob_snapshot/cluster_10:0.028177278950612568 - cluster/prob_snapshot/cluster_11:0.01206279884564124 - cluster/prob_snapshot/cluster_12:0.017810367707387947 - cluster/prob_snapshot/cluster_13:0.014191528053695578 - 
cluster/prob_snapshot/cluster_14:0.017810367707387947 - cluster/prob_snapshot/cluster_15:0.014191528053695578 - cluster/prob_snapshot/cluster_16:0.017810367707387947 - cluster/prob_snapshot/cluster_17:0.014191528053695578 - cluster/prob_snapshot/cluster_18:0.01206279884564124 - cluster/prob_snapshot/cluster_19:0.01632025726174991 - cluster/prob_snapshot/cluster_20:0.014191528053695578 - cluster/prob_snapshot/cluster_21:0.014830146816111877 - cluster/prob_snapshot/cluster_22:0.010572688400003204 - cluster/prob_snapshot/cluster_23:0.01632025726174991 - cluster/prob_snapshot/cluster_24:0.01632025726174991 - cluster/prob_snapshot/cluster_25:0.014191528053695578 - cluster/prob_snapshot/cluster_26:0.01161576571194983 - cluster/prob_snapshot/cluster_27:0.017576207494501967 - cluster/prob_snapshot/cluster_28:0.017810367707387947 - cluster/prob_snapshot/cluster_29:0.01206279884564124 - cluster/prob_snapshot/cluster_30:0.01632025726174991 - cluster/prob_snapshot/cluster_31:0.018853445019334572 - cluster/prob_snapshot/cluster_32:0.01161576571194983 - cluster/prob_snapshot/cluster_33:0.014191528053695578 - cluster/prob_snapshot/cluster_34:0.01632025726174991 - cluster/prob_snapshot/cluster_35:0.02206782612349662 - cluster/prob_snapshot/cluster_36:0.01632025726174991 - cluster/prob_snapshot/cluster_37:0.017810367707387947 - cluster/prob_snapshot/cluster_38:0.02093959964322782 - cluster/prob_snapshot/cluster_39:0.01632025726174991 - cluster/prob_snapshot/cluster_40:0.020577715677858585 - cluster/prob_snapshot/cluster_41:0.01632025726174991 - cluster/prob_snapshot/cluster_42:0.01206279884564124 - cluster/prob_snapshot/cluster_43:0.01632025726174991 - cluster/prob_snapshot/cluster_44:0.014191528053695578 - cluster/prob_snapshot/cluster_45:0.01632025726174991 - cluster/prob_snapshot/cluster_46:0.014191528053695578 - cluster/prob_snapshot/cluster_47:0.014191528053695578 - cluster/prob_snapshot/cluster_48:0.01632025726174991 - cluster/prob_snapshot/cluster_49:0.018853445019334572 - 
cluster/prob_snapshot/cluster_50:0.010572688400003204 - cluster/prob_snapshot/cluster_51:0.01632025726174991 - cluster/prob_snapshot/cluster_52:0.017810367707387947 - cluster/prob_snapshot/cluster_53:0.010572688400003204 - cluster/prob_snapshot/cluster_54:0.017810367707387947 - cluster/prob_snapshot/cluster_55:0.014191528053695578 - cluster/prob_snapshot/cluster_56:0.018123290900971933 - cluster/prob_snapshot/cluster_57:0.018853445019334572 - cluster/prob_snapshot/cluster_58:0.021043907374422483 - cluster/prob_snapshot/cluster_59:0.020452482538548748 - cluster/prob_snapshot/cluster_60:0.014191528053695578 - cluster/prob_snapshot/cluster_61:0.014191528053695578 - cluster/prob_snapshot/cluster_62:0.017810367707387947 - cluster/prob_snapshot/cluster_63:0.01206279884564124
[36m(TaskRunner pid=2823680)[0m Training Progress:  20%|██        | 163/800 [5:06:28<20:59:13, 118.61s/it]
[36m(TaskRunner pid=2823680)[0m step:163 - global_seqlen/min:327168 - global_seqlen/max:493095 - global_seqlen/minmax_diff:165927 - global_seqlen/balanced_min:371913 - global_seqlen/balanced_max:372043 - global_seqlen/mean:371970.5 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.227822642111116) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013594805262982845 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.01685578427350265) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004342375469099756) - actor/ppo_kl:np.float64(-1.7100265318327324e-06) - actor/pg_clipfrac_lower:np.float64(2.5040816480112777e-07) - actor/grad_norm:np.float64(0.23123054082194963) - perf/mfu/actor:np.float64(0.2198363269221692) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(259.69325733184814) - actor/lr:np.float64(1e-06) - training/global_step:163 - training/epoch:0 - critic/score/mean:0.5458333492279053 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5353776812553406 - critic/rewards/max:1.0062183141708374 - critic/rewards/min:-0.07021671533584595 - critic/advantages/mean:-0.16639924049377441 - critic/advantages/max:2.4748504161834717 - critic/advantages/min:-2.4748542308807373 - critic/returns/mean:-0.16639924049377441 - critic/returns/max:2.4748504161834717 - critic/returns/min:-2.4748542308807373 - response_length/mean:1094.75830078125 - response_length/max:8192.0 - response_length/min:199.0 - response_length/clip_ratio:0.011111111380159855 - response_length_non_aborted/mean:1094.75830078125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:199.0 - response_length_non_aborted/clip_ratio:0.011111111380159855 - response/aborted_ratio:0.0 - prompt_length/mean:243.08888244628906 - prompt_length/max:355.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00013405457139015198 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.581163028255105) - timing_s/agent_loop/generate_sequences/max:np.float64(30.090947141870856) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.476409648873414) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.090947141870856) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:221 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.117197904735804 - timing_s/reward:0.00019073300063610077 - timing_s/old_log_prob:10.894705845974386 - timing_s/ref:23.179108834825456 - timing_s/adv:0.0698370635509491 - timing_s/update_actor:19.967216080985963 - timing_s/update_weights:31.447063580155373 - timing_s/step:118.10109937749803 - timing_s/stop_profile:6.42361119389534e-05 - timing_per_token_ms/adv:7.250149343467335e-05 - timing_per_token_ms/update_actor:0.02072900709160235 - timing_per_token_ms/gen:0.0407461792743906 - timing_per_token_ms/ref:0.02406344026454758 - perf/total_num_tokens:1487882 - perf/time_per_step:118.10109937749803 - perf/throughput:3149.5938815187023 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:251.0 - frontier/mean_score:2.1938943078124997 - frontier/mean_frontier_pct:0.10567065435526812 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.9540999999999997 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.343 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.51 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.2797 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.0569999999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.7 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.3 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.3 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.51 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:1.637 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.4769999999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:2.51 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.3 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.6569999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.637 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.11 - frontier/cluster_35/cluster_size:184.0 - 
frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:2.9656999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.51 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.51 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:1.7 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.3 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.3 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.3 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:2.6569999999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:32.0 - frontier/cluster_53/score:1.49 - 
frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.51 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.5540999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.6569999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.9759899999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:2.9176456999999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:1.7 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:32.0 - frontier/cluster_63/score:1.49 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:163.0 - cluster/prob_snapshot/cluster_0:0.01424407725053944 - cluster/prob_snapshot/cluster_1:0.013917175677639558 - cluster/prob_snapshot/cluster_2:0.009564897873737235 - cluster/prob_snapshot/cluster_3:0.016380688838120355 - cluster/prob_snapshot/cluster_4:0.01424407725053944 - cluster/prob_snapshot/cluster_5:0.009564897873737235 - cluster/prob_snapshot/cluster_6:0.010611837551651883 - cluster/prob_snapshot/cluster_7:0.01424407725053944 - cluster/prob_snapshot/cluster_8:0.017876316949426995 - cluster/prob_snapshot/cluster_9:0.010611837551651883 - cluster/prob_snapshot/cluster_10:0.030480188704566823 - cluster/prob_snapshot/cluster_11:0.012107465662958524 - cluster/prob_snapshot/cluster_12:0.01465003345217981 - cluster/prob_snapshot/cluster_13:0.01424407725053944 - 
cluster/prob_snapshot/cluster_14:0.017876316949426995 - cluster/prob_snapshot/cluster_15:0.01424407725053944 - cluster/prob_snapshot/cluster_16:0.017876316949426995 - cluster/prob_snapshot/cluster_17:0.01424407725053944 - cluster/prob_snapshot/cluster_18:0.012107465662958524 - cluster/prob_snapshot/cluster_19:0.016380688838120355 - cluster/prob_snapshot/cluster_20:0.016380688838120355 - cluster/prob_snapshot/cluster_21:0.014885060726813714 - cluster/prob_snapshot/cluster_22:0.010611837551651883 - cluster/prob_snapshot/cluster_23:0.016380688838120355 - cluster/prob_snapshot/cluster_24:0.017876316949426995 - cluster/prob_snapshot/cluster_25:0.01424407725053944 - cluster/prob_snapshot/cluster_26:0.011658777229566533 - cluster/prob_snapshot/cluster_27:0.017641289674793094 - cluster/prob_snapshot/cluster_28:0.017876316949426995 - cluster/prob_snapshot/cluster_29:0.012107465662958524 - cluster/prob_snapshot/cluster_30:0.016380688838120355 - cluster/prob_snapshot/cluster_31:0.018923256627341643 - cluster/prob_snapshot/cluster_32:0.011658777229566533 - cluster/prob_snapshot/cluster_33:0.01424407725053944 - cluster/prob_snapshot/cluster_34:0.016380688838120355 - cluster/prob_snapshot/cluster_35:0.022149540124588828 - cluster/prob_snapshot/cluster_36:0.016380688838120355 - cluster/prob_snapshot/cluster_37:0.017876316949426995 - cluster/prob_snapshot/cluster_38:0.021121829950962408 - cluster/prob_snapshot/cluster_39:0.017876316949426995 - cluster/prob_snapshot/cluster_40:0.020653912013282188 - cluster/prob_snapshot/cluster_41:0.017876316949426995 - cluster/prob_snapshot/cluster_42:0.012107465662958524 - cluster/prob_snapshot/cluster_43:0.016380688838120355 - cluster/prob_snapshot/cluster_44:0.01424407725053944 - cluster/prob_snapshot/cluster_45:0.016380688838120355 - cluster/prob_snapshot/cluster_46:0.01424407725053944 - cluster/prob_snapshot/cluster_47:0.01424407725053944 - cluster/prob_snapshot/cluster_48:0.016380688838120355 - 
cluster/prob_snapshot/cluster_49:0.018923256627341643 - cluster/prob_snapshot/cluster_50:0.010611837551651883 - cluster/prob_snapshot/cluster_51:0.016380688838120355 - cluster/prob_snapshot/cluster_52:0.018923256627341643 - cluster/prob_snapshot/cluster_53:0.010611837551651883 - cluster/prob_snapshot/cluster_54:0.017876316949426995 - cluster/prob_snapshot/cluster_55:0.01424407725053944 - cluster/prob_snapshot/cluster_56:0.01819039885280139 - cluster/prob_snapshot/cluster_57:0.018923256627341643 - cluster/prob_snapshot/cluster_58:0.021195115728416432 - cluster/prob_snapshot/cluster_59:0.020779585370252106 - cluster/prob_snapshot/cluster_60:0.012107465662958524 - cluster/prob_snapshot/cluster_61:0.012107465662958524 - cluster/prob_snapshot/cluster_62:0.017876316949426995 - cluster/prob_snapshot/cluster_63:0.010611837551651883
[36m(TaskRunner pid=2823680)[0m Training Progress:  20%|██        | 164/800 [5:08:33<21:17:50, 120.55s/it]
[36m(TaskRunner pid=2823680)[0m step:164 - global_seqlen/min:394924 - global_seqlen/max:477190 - global_seqlen/minmax_diff:82266 - global_seqlen/balanced_min:422431 - global_seqlen/balanced_max:422648 - global_seqlen/mean:422534.25 - frontier/skipped_zero_acc_count:44.0 - actor/entropy:np.float64(0.22839818837209827) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011555947363376617 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.06476131266390439) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003275933795088176) - actor/ppo_kl:np.float64(-1.7652611660423645e-05) - actor/pg_clipfrac_lower:np.float64(1.1637487962919597e-06) - actor/grad_norm:np.float64(0.21351919390938498) - perf/mfu/actor:np.float64(0.24633742532483444) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(258.52409648895264) - actor/lr:np.float64(1e-06) - training/global_step:164 - training/epoch:0 - critic/score/mean:0.538690447807312 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5274884104728699 - critic/rewards/max:1.0003137588500977 - critic/rewards/min:-0.07394792139530182 - critic/advantages/mean:-0.20037376880645752 - critic/advantages/max:2.4747848510742188 - critic/advantages/min:-2.4748408794403076 - critic/returns/mean:-0.20037376880645752 - critic/returns/max:2.4747848510742188 - critic/returns/min:-2.4748408794403076 - response_length/mean:1272.299072265625 - response_length/max:8192.0 - response_length/min:196.0 - response_length/clip_ratio:0.014880952425301075 - response_length_non_aborted/mean:1272.299072265625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:196.0 - response_length_non_aborted/clip_ratio:0.014880952425301075 - response/aborted_ratio:0.0 - prompt_length/mean:236.65475463867188 - prompt_length/max:358.0 - prompt_length/min:181.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.000208228826523e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.554841948673129) - timing_s/agent_loop/generate_sequences/max:np.float64(30.050916293635964) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.78407646456435) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.050916293635964) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:203 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.29858411569148 - timing_s/reward:0.00014219246804714203 - timing_s/old_log_prob:11.543280510231853 - timing_s/ref:25.733921396546066 - timing_s/adv:0.06865677516907454 - timing_s/update_actor:20.302361713722348 - timing_s/update_weights:34.43954699579626 - timing_s/step:124.82506369054317 - timing_s/stop_profile:5.8506615459918976e-05 - timing_per_token_ms/adv:6.770771611232803e-05 - timing_per_token_ms/update_actor:0.020021717302296065 - timing_per_token_ms/gen:0.03777678452334425 - timing_per_token_ms/ref:0.025378195233951763 - perf/total_num_tokens:1690137 - perf/time_per_step:124.82506369054317 - perf/throughput:3385.0112910618245 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:295.0 - frontier/mean_score:2.2069365935937495 - frontier/mean_frontier_pct:0.12077673589446022 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:1.9540999999999997 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.2401 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.3 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.0569999999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:1.49 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.2797 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.0569999999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.9 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.7 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.3 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.3 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.51 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:1.637 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.4769999999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:2.51 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.3 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:2.7598999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.637 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.11 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:2.9656999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.6569999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.6569999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.51 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.3 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:2.6569999999999996 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.51 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.5540999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.6569999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.9759899999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:2.9423519899999997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.09 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:32.0 - frontier/cluster_63/score:1.49 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:164.0 - cluster/prob_snapshot/cluster_0:0.014159899333180601 - cluster/prob_snapshot/cluster_1:0.013834929643484104 - cluster/prob_snapshot/cluster_2:0.009508372402230774 - cluster/prob_snapshot/cluster_3:0.01628388423315769 - cluster/prob_snapshot/cluster_4:0.014159899333180601 - cluster/prob_snapshot/cluster_5:0.008779845581538631 - cluster/prob_snapshot/cluster_6:0.010549125003219547 - cluster/prob_snapshot/cluster_7:0.01628388423315769 - cluster/prob_snapshot/cluster_8:0.014563456464176245 - cluster/prob_snapshot/cluster_9:0.010549125003219547 - cluster/prob_snapshot/cluster_10:0.030300060588106508 - cluster/prob_snapshot/cluster_11:0.01203591443320351 - 
cluster/prob_snapshot/cluster_12:0.014563456464176245 - cluster/prob_snapshot/cluster_13:0.014159899333180601 - cluster/prob_snapshot/cluster_14:0.017770673663141653 - cluster/prob_snapshot/cluster_15:0.02053185403311187 - cluster/prob_snapshot/cluster_16:0.017770673663141653 - cluster/prob_snapshot/cluster_17:0.014159899333180601 - cluster/prob_snapshot/cluster_18:0.01203591443320351 - cluster/prob_snapshot/cluster_19:0.017770673663141653 - cluster/prob_snapshot/cluster_20:0.01628388423315769 - cluster/prob_snapshot/cluster_21:0.014797094803173727 - cluster/prob_snapshot/cluster_22:0.009508372402230774 - cluster/prob_snapshot/cluster_23:0.01628388423315769 - cluster/prob_snapshot/cluster_24:0.017770673663141653 - cluster/prob_snapshot/cluster_25:0.014159899333180601 - cluster/prob_snapshot/cluster_26:0.011589877604208322 - cluster/prob_snapshot/cluster_27:0.01753703532414417 - cluster/prob_snapshot/cluster_28:0.017770673663141653 - cluster/prob_snapshot/cluster_29:0.01203591443320351 - cluster/prob_snapshot/cluster_30:0.01628388423315769 - cluster/prob_snapshot/cluster_31:0.019539953084822568 - cluster/prob_snapshot/cluster_32:0.011589877604208322 - cluster/prob_snapshot/cluster_33:0.014159899333180601 - cluster/prob_snapshot/cluster_34:0.01628388423315769 - cluster/prob_snapshot/cluster_35:0.022018643463095833 - cluster/prob_snapshot/cluster_36:0.01628388423315769 - cluster/prob_snapshot/cluster_37:0.017770673663141653 - cluster/prob_snapshot/cluster_38:0.02099700672620685 - cluster/prob_snapshot/cluster_39:0.018811426264130425 - cluster/prob_snapshot/cluster_40:0.02053185403311187 - cluster/prob_snapshot/cluster_41:0.018811426264130425 - cluster/prob_snapshot/cluster_42:0.010549125003219547 - cluster/prob_snapshot/cluster_43:0.017770673663141653 - cluster/prob_snapshot/cluster_44:0.014159899333180601 - cluster/prob_snapshot/cluster_45:0.01628388423315769 - cluster/prob_snapshot/cluster_46:0.014159899333180601 - 
cluster/prob_snapshot/cluster_47:0.014159899333180601 - cluster/prob_snapshot/cluster_48:0.01628388423315769 - cluster/prob_snapshot/cluster_49:0.018811426264130425 - cluster/prob_snapshot/cluster_50:0.009508372402230774 - cluster/prob_snapshot/cluster_51:0.013522703863187473 - cluster/prob_snapshot/cluster_52:0.018811426264130425 - cluster/prob_snapshot/cluster_53:0.009508372402230774 - cluster/prob_snapshot/cluster_54:0.017770673663141653 - cluster/prob_snapshot/cluster_55:0.014159899333180601 - cluster/prob_snapshot/cluster_56:0.018082899443438282 - cluster/prob_snapshot/cluster_57:0.018811426264130425 - cluster/prob_snapshot/cluster_58:0.021069859408276066 - cluster/prob_snapshot/cluster_59:0.020831703990591803 - cluster/prob_snapshot/cluster_60:0.01203591443320351 - cluster/prob_snapshot/cluster_61:0.014797094803173727 - cluster/prob_snapshot/cluster_62:0.017770673663141653 - cluster/prob_snapshot/cluster_63:0.010549125003219547
(TaskRunner pid=2823680) Training Progress:  21%|██        | 165/800 [5:10:23<20:42:43, 117.42s/it]
[36m(TaskRunner pid=2823680)[0m step:165 - global_seqlen/min:324637 - global_seqlen/max:401037 - global_seqlen/minmax_diff:76400 - global_seqlen/balanced_min:366216 - global_seqlen/balanced_max:366420 - global_seqlen/mean:366291.25 - frontier/skipped_zero_acc_count:26.0 - actor/entropy:np.float64(0.22166748611512138) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011921207420527935 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.12175982017652132) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00035346906738328366) - actor/ppo_kl:np.float64(4.946427999300061e-05) - actor/pg_clipfrac_lower:np.float64(1.8770996950265459e-06) - actor/grad_norm:np.float64(0.25306782814172596) - perf/mfu/actor:np.float64(0.18433740059705797) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(260.0869026184082) - actor/lr:np.float64(1e-06) - training/global_step:165 - training/epoch:0 - critic/score/mean:0.6066176295280457 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5962080955505371 - critic/rewards/max:1.004404067993164 - critic/rewards/min:-0.06031949073076248 - critic/advantages/mean:-0.16039817035198212 - critic/advantages/max:2.474839925765991 - critic/advantages/min:-2.474853277206421 - critic/returns/mean:-0.16039817035198212 - critic/returns/max:2.474839925765991 - critic/returns/min:-2.474853277206421 - response_length/mean:1141.2071533203125 - response_length/max:8192.0 - response_length/min:176.0 - response_length/clip_ratio:0.00857843179255724 - response_length_non_aborted/mean:1141.2071533203125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:176.0 - response_length_non_aborted/clip_ratio:0.00857843179255724 - response/aborted_ratio:0.0 - prompt_length/mean:241.3333282470703 - prompt_length/max:381.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.279325604438782e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.482999193482101) - timing_s/agent_loop/generate_sequences/max:np.float64(28.534409570507705) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.848346093195687) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.534409570507705) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:229 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.85117490030825 - timing_s/reward:0.00023835338652133942 - timing_s/old_log_prob:10.701971651986241 - timing_s/ref:17.840135522186756 - timing_s/adv:0.11492261663079262 - timing_s/update_actor:23.229000722058117 - timing_s/update_weights:26.708785951137543 - timing_s/step:109.88212459068745 - timing_s/stop_profile:6.666779518127441e-05 - timing_per_token_ms/adv:0.00010186793513893294 - timing_per_token_ms/update_actor:0.020590292914221844 - timing_per_token_ms/gen:0.033129667803493516 - timing_per_token_ms/ref:0.015813578053851522 - perf/total_num_tokens:1465165 - perf/time_per_step:109.88212459068745 - perf/throughput:3333.4926073229867 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:321.0 - frontier/mean_score:2.25791268734375 - frontier/mean_frontier_pct:0.131341866521282 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:2.9 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:2.2678699999999994 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.2401 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.51 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.0569999999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.343 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.2797 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.0569999999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.9299999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.3 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.7 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.3 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.51 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.51 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:1.637 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.4769999999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:2.51 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:2.09 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.3 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:2.8319299999999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.637 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - 
frontier/cluster_35/score:3.11 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:2.9656999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.6569999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.7598999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.51 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.3 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.3 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:2.6569999999999996 - frontier/cluster_52/cluster_size:91.0 
- frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.343 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.6569999999999996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.3 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.6878699999999993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.6569999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.9759899999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:2.9423519899999997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.09 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:165.0 - cluster/prob_snapshot/cluster_0:0.020068313648260002 - cluster/prob_snapshot/cluster_1:0.01569390568051014 - cluster/prob_snapshot/cluster_2:0.009293705251590753 - cluster/prob_snapshot/cluster_3:0.015916248755516553 - cluster/prob_snapshot/cluster_4:0.01384021630914483 - cluster/prob_snapshot/cluster_5:0.008581626122485251 - cluster/prob_snapshot/cluster_6:0.010310961150312898 - cluster/prob_snapshot/cluster_7:0.01736947146797676 - cluster/prob_snapshot/cluster_8:0.014234662473955453 - cluster/prob_snapshot/cluster_9:0.009293705251590753 - cluster/prob_snapshot/cluster_10:0.029615986869123565 - cluster/prob_snapshot/cluster_11:0.011764183862773106 - 
cluster/prob_snapshot/cluster_12:0.014234662473955453 - cluster/prob_snapshot/cluster_13:0.01384021630914483 - cluster/prob_snapshot/cluster_14:0.01736947146797676 - cluster/prob_snapshot/cluster_15:0.020275916892897174 - cluster/prob_snapshot/cluster_16:0.01736947146797676 - cluster/prob_snapshot/cluster_17:0.015916248755516553 - cluster/prob_snapshot/cluster_18:0.011764183862773106 - cluster/prob_snapshot/cluster_19:0.018386727366698902 - cluster/prob_snapshot/cluster_20:0.015916248755516553 - cluster/prob_snapshot/cluster_21:0.014463026043056345 - cluster/prob_snapshot/cluster_22:0.009293705251590753 - cluster/prob_snapshot/cluster_23:0.01736947146797676 - cluster/prob_snapshot/cluster_24:0.01736947146797676 - cluster/prob_snapshot/cluster_25:0.01384021630914483 - cluster/prob_snapshot/cluster_26:0.011328217049035043 - cluster/prob_snapshot/cluster_27:0.01714110789887587 - cluster/prob_snapshot/cluster_28:0.01736947146797676 - cluster/prob_snapshot/cluster_29:0.014463026043056345 - cluster/prob_snapshot/cluster_30:0.015916248755516553 - cluster/prob_snapshot/cluster_31:0.019597261886178254 - cluster/prob_snapshot/cluster_32:0.011328217049035043 - cluster/prob_snapshot/cluster_33:0.01384021630914483 - cluster/prob_snapshot/cluster_34:0.015916248755516553 - cluster/prob_snapshot/cluster_35:0.021521536360720208 - cluster/prob_snapshot/cluster_36:0.015916248755516553 - cluster/prob_snapshot/cluster_37:0.01736947146797676 - cluster/prob_snapshot/cluster_38:0.020522964754015407 - cluster/prob_snapshot/cluster_39:0.018386727366698902 - cluster/prob_snapshot/cluster_40:0.020068313648260002 - cluster/prob_snapshot/cluster_41:0.019098806495804404 - cluster/prob_snapshot/cluster_42:0.010310961150312898 - cluster/prob_snapshot/cluster_43:0.01736947146797676 - cluster/prob_snapshot/cluster_44:0.015916248755516553 - cluster/prob_snapshot/cluster_45:0.015916248755516553 - cluster/prob_snapshot/cluster_46:0.01384021630914483 - 
cluster/prob_snapshot/cluster_47:0.01384021630914483 - cluster/prob_snapshot/cluster_48:0.015916248755516553 - cluster/prob_snapshot/cluster_49:0.018386727366698902 - cluster/prob_snapshot/cluster_50:0.009293705251590753 - cluster/prob_snapshot/cluster_51:0.013217406575233312 - cluster/prob_snapshot/cluster_52:0.018386727366698902 - cluster/prob_snapshot/cluster_53:0.009293705251590753 - cluster/prob_snapshot/cluster_54:0.018386727366698902 - cluster/prob_snapshot/cluster_55:0.015916248755516553 - cluster/prob_snapshot/cluster_56:0.01860035110543055 - cluster/prob_snapshot/cluster_57:0.018386727366698902 - cluster/prob_snapshot/cluster_58:0.020594172666925956 - cluster/prob_snapshot/cluster_59:0.02036139399962137 - cluster/prob_snapshot/cluster_60:0.011764183862773106 - cluster/prob_snapshot/cluster_61:0.014463026043056345 - cluster/prob_snapshot/cluster_62:0.01736947146797676 - cluster/prob_snapshot/cluster_63:0.009293705251590753
(TaskRunner pid=2823680) Training Progress:  21%|██        | 166/800 [5:12:37<21:33:23, 122.40s/it]
[36m(TaskRunner pid=2823680)[0m step:166 - global_seqlen/min:361986 - global_seqlen/max:449231 - global_seqlen/minmax_diff:87245 - global_seqlen/balanced_min:400196 - global_seqlen/balanced_max:400271 - global_seqlen/mean:400234.5 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.21747280284762383) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011929858475923538 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.09150089417198615) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003685230083795673) - actor/ppo_kl:np.float64(2.180567654488873e-05) - actor/pg_clipfrac_lower:np.float64(4.1574675360236e-06) - actor/grad_norm:np.float64(0.2098141020307174) - perf/mfu/actor:np.float64(0.19037754810060334) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(260.1168899536133) - actor/lr:np.float64(1e-06) - training/global_step:166 - training/epoch:0 - critic/score/mean:0.5982142686843872 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.587380588054657 - critic/rewards/max:1.0533475875854492 - critic/rewards/min:-0.046532683074474335 - critic/advantages/mean:-0.1289559304714203 - critic/advantages/max:2.4748318195343018 - critic/advantages/min:-2.474832534790039 - critic/returns/mean:-0.1289559304714203 - critic/returns/max:2.4748318195343018 - critic/returns/min:-2.474832534790039 - response_length/mean:1277.64794921875 - response_length/max:8192.0 - response_length/min:64.0 - response_length/clip_ratio:0.01785714365541935 - response_length_non_aborted/mean:1277.64794921875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:64.0 - response_length_non_aborted/clip_ratio:0.01785714365541935 - response/aborted_ratio:0.0 - prompt_length/mean:244.83673095703125 - prompt_length/max:555.0 - prompt_length/min:178.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.408034384250641e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.6165370708331466) - timing_s/agent_loop/generate_sequences/max:np.float64(29.864818847738206) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.131587020878214) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.864818847738206) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:193 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.04970036353916 - timing_s/reward:0.00023396499454975128 - timing_s/old_log_prob:11.738233288750052 - timing_s/ref:27.91177397966385 - timing_s/adv:0.09632622450590134 - timing_s/update_actor:24.671880876645446 - timing_s/update_weights:36.83736250642687 - timing_s/step:133.77573246881366 - timing_s/stop_profile:6.755441427230835e-05 - timing_per_token_ms/adv:8.070037273413605e-05 - timing_per_token_ms/update_actor:0.020669656607121688 - timing_per_token_ms/gen:0.03199607494193647 - timing_per_token_ms/ref:0.023383980586634906 - perf/total_num_tokens:1600938 - perf/time_per_step:133.77573246881366 - perf/throughput:2991.831871249924 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:351.0 - frontier/mean_score:2.28489079671875 - frontier/mean_frontier_pct:0.15347654742436195 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.53 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:2.4875089999999993 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.51 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.7680699999999998 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.51 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.0569999999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.343 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.2797 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:1.7 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.0569999999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.3 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.9299999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.3 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.7 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.3 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.6569999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.51 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:1.637 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.4769999999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:2.51 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:2.09 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.3 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.8823509999999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.4459 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:1.91 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - 
frontier/cluster_35/score:3.0769999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:2.9656999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.6569999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.4319299999999995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.51 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.3 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.3 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.1598999999999995 - 
frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.6569999999999996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.3 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.7815089999999993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.6569999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.9759899999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:2.9423519899999997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.09 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:166.0 - cluster/prob_snapshot/cluster_0:0.02413955628829523 - cluster/prob_snapshot/cluster_1:0.01701058456746203 - cluster/prob_snapshot/cluster_2:0.009183972831495892 - cluster/prob_snapshot/cluster_3:0.017164387049184427 - cluster/prob_snapshot/cluster_4:0.013676802429629025 - cluster/prob_snapshot/cluster_5:0.012090772035877094 - cluster/prob_snapshot/cluster_6:0.010189217810073625 - cluster/prob_snapshot/cluster_7:0.017164387049184427 - cluster/prob_snapshot/cluster_8:0.01406659129887345 - cluster/prob_snapshot/cluster_9:0.009183972831495892 - cluster/prob_snapshot/cluster_10:0.029266305679041673 - 
cluster/prob_snapshot/cluster_11:0.01162528206518467 - cluster/prob_snapshot/cluster_12:0.01406659129887345 - cluster/prob_snapshot/cluster_13:0.01572832279407338 - cluster/prob_snapshot/cluster_14:0.017164387049184427 - cluster/prob_snapshot/cluster_15:0.020036515559406522 - cluster/prob_snapshot/cluster_16:0.017164387049184427 - cluster/prob_snapshot/cluster_17:0.01572832279407338 - cluster/prob_snapshot/cluster_18:0.01162528206518467 - cluster/prob_snapshot/cluster_19:0.01816963202776216 - cluster/prob_snapshot/cluster_20:0.01572832279407338 - cluster/prob_snapshot/cluster_21:0.014292258538962332 - cluster/prob_snapshot/cluster_22:0.009183972831495892 - cluster/prob_snapshot/cluster_23:0.01816963202776216 - cluster/prob_snapshot/cluster_24:0.017164387049184427 - cluster/prob_snapshot/cluster_25:0.01162528206518467 - cluster/prob_snapshot/cluster_26:0.011194462788651358 - cluster/prob_snapshot/cluster_27:0.016938719809095545 - cluster/prob_snapshot/cluster_28:0.017164387049184427 - cluster/prob_snapshot/cluster_29:0.014292258538962332 - cluster/prob_snapshot/cluster_30:0.01572832279407338 - cluster/prob_snapshot/cluster_31:0.01971067257992182 - cluster/prob_snapshot/cluster_32:0.009887644316500304 - cluster/prob_snapshot/cluster_33:0.013676802429629025 - cluster/prob_snapshot/cluster_34:0.013061346320295718 - cluster/prob_snapshot/cluster_35:0.021041760537984254 - cluster/prob_snapshot/cluster_36:0.01572832279407338 - cluster/prob_snapshot/cluster_37:0.017164387049184427 - cluster/prob_snapshot/cluster_38:0.0202806464827754 - cluster/prob_snapshot/cluster_39:0.01816963202776216 - cluster/prob_snapshot/cluster_40:0.019831363522962088 - cluster/prob_snapshot/cluster_41:0.023468914281158368 - cluster/prob_snapshot/cluster_42:0.010189217810073625 - cluster/prob_snapshot/cluster_43:0.017164387049184427 - cluster/prob_snapshot/cluster_44:0.01572832279407338 - cluster/prob_snapshot/cluster_45:0.01572832279407338 - cluster/prob_snapshot/cluster_46:0.013676802429629025 - 
cluster/prob_snapshot/cluster_47:0.013676802429629025 - cluster/prob_snapshot/cluster_48:0.01572832279407338 - cluster/prob_snapshot/cluster_49:0.01816963202776216 - cluster/prob_snapshot/cluster_50:0.009183972831495892 - cluster/prob_snapshot/cluster_51:0.013061346320295718 - cluster/prob_snapshot/cluster_52:0.014770262783877863 - cluster/prob_snapshot/cluster_53:0.012583342075380184 - cluster/prob_snapshot/cluster_54:0.01816963202776216 - cluster/prob_snapshot/cluster_55:0.01572832279407338 - cluster/prob_snapshot/cluster_56:0.019021074524617498 - cluster/prob_snapshot/cluster_57:0.01816963202776216 - cluster/prob_snapshot/cluster_58:0.02035101363127584 - cluster/prob_snapshot/cluster_59:0.020120983422827896 - cluster/prob_snapshot/cluster_60:0.010189217810073625 - cluster/prob_snapshot/cluster_61:0.014292258538962332 - cluster/prob_snapshot/cluster_62:0.017164387049184427 - cluster/prob_snapshot/cluster_63:0.009183972831495892
(TaskRunner pid=2823680) Training Progress:  21%|██        | 167/800 [5:14:38<21:26:41, 121.96s/it]
(TaskRunner pid=2823680) step:167 - global_seqlen/min:351012 - global_seqlen/max:399779 - global_seqlen/minmax_diff:48767 - global_seqlen/balanced_min:374657 - global_seqlen/balanced_max:374785 - global_seqlen/mean:374707.75 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.22428892477776627) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012320653535425663 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.0012530835701909382) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00036715327871429416) - actor/ppo_kl:np.float64(2.3470917854559288e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2278209557900062) - perf/mfu/actor:np.float64(0.2168080383061519) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(259.0511169433594) - actor/lr:np.float64(1e-06) - training/global_step:167 - training/epoch:0 - critic/score/mean:0.5956632494926453 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.584972620010376 - critic/rewards/max:1.028486967086792 - critic/rewards/min:-0.043813008815050125 - critic/advantages/mean:-0.10700186342000961 - critic/advantages/max:2.474813222885132 - critic/advantages/min:-2.474846601486206 - critic/returns/mean:-0.10700186342000961 - critic/returns/max:2.474813222885132 - critic/returns/min:-2.474846601486206 - response_length/mean:1119.376220703125 - response_length/max:8192.0 - response_length/min:172.0 - response_length/clip_ratio:0.011479591950774193 - response_length_non_aborted/mean:1119.376220703125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:172.0 - response_length_non_aborted/clip_ratio:0.011479591950774193 - response/aborted_ratio:0.0 - prompt_length/mean:235.948974609375 - prompt_length/max:401.0 - prompt_length/min:182.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.838996291160583e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.9020661748945713) - timing_s/agent_loop/generate_sequences/max:np.float64(28.805224644951522) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.335876762335829) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.805224644951522) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:222 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.612496766261756 - timing_s/reward:0.000490047037601471 - timing_s/old_log_prob:12.021855164319277 - timing_s/ref:25.89224050939083 - timing_s/adv:0.11816422548145056 - timing_s/update_actor:20.327721188776195 - timing_s/update_weights:30.037410400807858 - timing_s/step:120.55357684474438 - timing_s/stop_profile:8.109398186206818e-05 - timing_per_token_ms/adv:0.00011120553888567918 - timing_per_token_ms/update_actor:0.019130622486672653 - timing_per_token_ms/gen:0.03602190173584478 - timing_per_token_ms/ref:0.02436744748313374 - perf/total_num_tokens:1498831 - perf/time_per_step:120.55357684474438 - perf/throughput:3108.2259009416994 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:381.0 - frontier/mean_score:2.3327068920781246 - frontier/mean_frontier_pct:0.15945375300654713 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:15.0 - frontier/batch_hard_count:0.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.53 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:2.4875089999999993 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.51 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.7680699999999998 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.6569999999999996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.0569999999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.343 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:3.89579 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.09 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.0569999999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.3 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.5509999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.6569999999999996 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - 
frontier/cluster_17/score:2.3 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.7 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.3 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.09 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.6569999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.6569999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:1.637 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.4769999999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:2.51 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:2.09 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:2.51 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.9176456999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.9121299999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.237 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.0769999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:2.9656999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.6569999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.3023509999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.51 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:2.51 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.3 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.3 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.1598999999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:1.8400999999999998 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:2.7598999999999996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.3 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.7815089999999993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.6569999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.9759899999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:2.9596463929999994 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.9429999999999998 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.09 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:167.0 - cluster/prob_snapshot/cluster_0:0.02364474087477972 - cluster/prob_snapshot/cluster_1:0.016661899639853376 - cluster/prob_snapshot/cluster_2:0.008995718695419026 - cluster/prob_snapshot/cluster_3:0.01681254946053742 - cluster/prob_snapshot/cluster_4:0.013396453753416273 - cluster/prob_snapshot/cluster_5:0.011842933993901354 - cluster/prob_snapshot/cluster_6:0.009980358046295122 - cluster/prob_snapshot/cluster_7:0.017797188811413515 - cluster/prob_snapshot/cluster_8:0.013778252685388633 - 
cluster/prob_snapshot/cluster_9:0.008995718695419026 - cluster/prob_snapshot/cluster_10:0.02609488528401079 - cluster/prob_snapshot/cluster_11:0.013999294172320003 - cluster/prob_snapshot/cluster_12:0.013778252685388633 - cluster/prob_snapshot/cluster_13:0.015405921816428712 - cluster/prob_snapshot/cluster_14:0.01681254946053742 - cluster/prob_snapshot/cluster_15:0.02378540363919059 - cluster/prob_snapshot/cluster_16:0.017797188811413515 - cluster/prob_snapshot/cluster_17:0.015405921816428712 - cluster/prob_snapshot/cluster_18:0.011386985690403832 - cluster/prob_snapshot/cluster_19:0.017797188811413515 - cluster/prob_snapshot/cluster_20:0.015405921816428712 - cluster/prob_snapshot/cluster_21:0.013999294172320003 - cluster/prob_snapshot/cluster_22:0.008995718695419026 - cluster/prob_snapshot/cluster_23:0.017797188811413515 - cluster/prob_snapshot/cluster_24:0.017797188811413515 - cluster/prob_snapshot/cluster_25:0.011386985690403832 - cluster/prob_snapshot/cluster_26:0.010964997397171218 - cluster/prob_snapshot/cluster_27:0.01659150797360605 - cluster/prob_snapshot/cluster_28:0.01681254946053742 - cluster/prob_snapshot/cluster_29:0.013999294172320003 - cluster/prob_snapshot/cluster_30:0.01681254946053742 - cluster/prob_snapshot/cluster_31:0.01954305284445192 - cluster/prob_snapshot/cluster_32:0.012807880557759927 - cluster/prob_snapshot/cluster_33:0.013396453753416273 - cluster/prob_snapshot/cluster_34:0.0149839335231961 - cluster/prob_snapshot/cluster_35:0.02061044409963093 - cluster/prob_snapshot/cluster_36:0.015405921816428712 - cluster/prob_snapshot/cluster_37:0.01681254946053742 - cluster/prob_snapshot/cluster_38:0.019864931448253315 - cluster/prob_snapshot/cluster_39:0.017797188811413515 - cluster/prob_snapshot/cluster_40:0.019424857942453595 - cluster/prob_snapshot/cluster_41:0.022119896224523986 - cluster/prob_snapshot/cluster_42:0.009980358046295122 - cluster/prob_snapshot/cluster_43:0.01681254946053742 - cluster/prob_snapshot/cluster_44:0.01681254946053742 
- cluster/prob_snapshot/cluster_45:0.015405921816428712 - cluster/prob_snapshot/cluster_46:0.013396453753416273 - cluster/prob_snapshot/cluster_47:0.015405921816428712 - cluster/prob_snapshot/cluster_48:0.015405921816428712 - cluster/prob_snapshot/cluster_49:0.017797188811413515 - cluster/prob_snapshot/cluster_50:0.008995718695419026 - cluster/prob_snapshot/cluster_51:0.01279361333451254 - cluster/prob_snapshot/cluster_52:0.0144675002310019 - cluster/prob_snapshot/cluster_53:0.01232540727583064 - cluster/prob_snapshot/cluster_54:0.01848643635702678 - cluster/prob_snapshot/cluster_55:0.015405921816428712 - cluster/prob_snapshot/cluster_56:0.018631178341605565 - cluster/prob_snapshot/cluster_57:0.017797188811413515 - cluster/prob_snapshot/cluster_58:0.019933856202814643 - cluster/prob_snapshot/cluster_59:0.019824383015144886 - cluster/prob_snapshot/cluster_60:0.013014654821443908 - cluster/prob_snapshot/cluster_61:0.013999294172320003 - cluster/prob_snapshot/cluster_62:0.01681254946053742 - cluster/prob_snapshot/cluster_63:0.008995718695419026
(TaskRunner pid=2823680) Training Progress:  21%|██        | 168/800 [5:16:40<21:25:51, 122.08s/it]
(TaskRunner pid=2823680) step:168 - global_seqlen/min:323520 - global_seqlen/max:445209 - global_seqlen/minmax_diff:121689 - global_seqlen/balanced_min:385811 - global_seqlen/balanced_max:385898 - global_seqlen/mean:385847.75 - frontier/skipped_zero_acc_count:29.0 - actor/entropy:np.float64(0.24028962017968297) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013527791015803814 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0902013034792617) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00046201546105294254) - actor/ppo_kl:np.float64(9.703288970499102e-05) - actor/pg_clipfrac_lower:np.float64(2.2609954066865614e-06) - actor/grad_norm:np.float64(0.24480633666882148) - perf/mfu/actor:np.float64(0.19985095733103336) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(260.1754627227783) - actor/lr:np.float64(1e-06) - training/global_step:168 - training/epoch:0 - critic/score/mean:0.4898989796638489 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.478574275970459 - critic/rewards/max:1.008942723274231 - critic/rewards/min:-0.05170314759016037 - critic/advantages/mean:-0.12719956040382385 - critic/advantages/max:2.4748308658599854 - critic/advantages/min:-2.474855899810791 - critic/returns/mean:-0.12719956040382385 - critic/returns/max:2.4748308658599854 - critic/returns/min:-2.474855899810791 - response_length/mean:1144.29541015625 - response_length/max:8192.0 - response_length/min:118.0 - response_length/clip_ratio:0.010101010091602802 - response_length_non_aborted/mean:1144.29541015625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:118.0 - response_length_non_aborted/clip_ratio:0.010101010091602802 - response/aborted_ratio:0.0 - prompt_length/mean:239.92929077148438 - prompt_length/max:399.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.363051503896713e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9877771753817797) - timing_s/agent_loop/generate_sequences/max:np.float64(29.276744023896754) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.12519326597976) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.276744023896754) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:207 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.223511868156493 - timing_s/reward:0.00017299316823482513 - timing_s/old_log_prob:11.051900753751397 - timing_s/ref:22.84343466255814 - timing_s/adv:0.10871780943125486 - timing_s/update_actor:22.603405356407166 - timing_s/update_weights:33.82193350140005 - timing_s/step:122.12382243014872 - timing_s/stop_profile:5.7872384786605835e-05 - timing_per_token_ms/adv:9.916739435089735e-05 - timing_per_token_ms/update_actor:0.02061778860683711 - timing_per_token_ms/gen:0.034452313814195244 - timing_per_token_ms/ref:0.020836732319770335 - perf/total_num_tokens:1543391 - perf/time_per_step:122.12382243014872 - perf/throughput:3159.479799452672 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:409.0 - frontier/mean_score:2.338237447223809 - frontier/mean_frontier_pct:0.1718459731238195 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.53 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.0412562999999992 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.51 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.7680699999999998 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:2.7598999999999996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.0569999999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.343 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:3.89579 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.09 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.0569999999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.3 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.5509999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.6569999999999996 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.3 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.7 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.3 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:2.3629999999999995 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.6569999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:2.7598999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:1.637 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.6338999999999997 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:2.6569999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:2.3629999999999995 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.9176456999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.9121299999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 
- frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.237 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.0769999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:2.9656999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.7598999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:3.211645699999999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.51 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:2.51 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.91 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.3 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.91 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - 
frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.1598999999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.1880699999999997 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:2.7598999999999996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:2.51 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.7815089999999993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.6569999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.9759899999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:2.9717524750999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.09 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:168.0 - cluster/prob_snapshot/cluster_0:0.023963240387871027 - cluster/prob_snapshot/cluster_1:0.013856973204010244 - cluster/prob_snapshot/cluster_2:0.00911689287277926 - cluster/prob_snapshot/cluster_3:0.017039017952848803 - cluster/prob_snapshot/cluster_4:0.013576906735337693 - cluster/prob_snapshot/cluster_5:0.012002460745774256 - cluster/prob_snapshot/cluster_6:0.010114795517826581 - cluster/prob_snapshot/cluster_7:0.018735452449429247 - 
cluster/prob_snapshot/cluster_8:0.013963848577294814 - cluster/prob_snapshot/cluster_9:0.00911689287277926 - cluster/prob_snapshot/cluster_10:0.026446388745230615 - cluster/prob_snapshot/cluster_11:0.014187867538427887 - cluster/prob_snapshot/cluster_12:0.013963848577294814 - cluster/prob_snapshot/cluster_13:0.015613442745638346 - cluster/prob_snapshot/cluster_14:0.017039017952848803 - cluster/prob_snapshot/cluster_15:0.02410579790859207 - cluster/prob_snapshot/cluster_16:0.018036920597896123 - cluster/prob_snapshot/cluster_17:0.015613442745638346 - cluster/prob_snapshot/cluster_18:0.011540370725037039 - cluster/prob_snapshot/cluster_19:0.018036920597896123 - cluster/prob_snapshot/cluster_20:0.015613442745638346 - cluster/prob_snapshot/cluster_21:0.01604111530780148 - cluster/prob_snapshot/cluster_22:0.00911689287277926 - cluster/prob_snapshot/cluster_23:0.018036920597896123 - cluster/prob_snapshot/cluster_24:0.018735452449429247 - cluster/prob_snapshot/cluster_25:0.011540370725037039 - cluster/prob_snapshot/cluster_26:0.0111126981628739 - cluster/prob_snapshot/cluster_27:0.01788010732510297 - cluster/prob_snapshot/cluster_28:0.018036920597896123 - cluster/prob_snapshot/cluster_29:0.01604111530780148 - cluster/prob_snapshot/cluster_30:0.013963848577294814 - cluster/prob_snapshot/cluster_31:0.019806301777829525 - cluster/prob_snapshot/cluster_32:0.012980405337920631 - cluster/prob_snapshot/cluster_33:0.013576906735337693 - cluster/prob_snapshot/cluster_34:0.01518577018347521 - cluster/prob_snapshot/cluster_35:0.020888071012317037 - cluster/prob_snapshot/cluster_36:0.015613442745638346 - cluster/prob_snapshot/cluster_37:0.017039017952848803 - cluster/prob_snapshot/cluster_38:0.020132516152495495 - cluster/prob_snapshot/cluster_39:0.018735452449429247 - cluster/prob_snapshot/cluster_40:0.019686514766239654 - cluster/prob_snapshot/cluster_41:0.021802107067924164 - cluster/prob_snapshot/cluster_42:0.010114795517826581 - 
cluster/prob_snapshot/cluster_43:0.017039017952848803 - cluster/prob_snapshot/cluster_44:0.017039017952848803 - cluster/prob_snapshot/cluster_45:0.012965945932247496 - cluster/prob_snapshot/cluster_46:0.013576906735337693 - cluster/prob_snapshot/cluster_47:0.015613442745638346 - cluster/prob_snapshot/cluster_48:0.012965945932247496 - cluster/prob_snapshot/cluster_49:0.018036920597896123 - cluster/prob_snapshot/cluster_50:0.00911689287277926 - cluster/prob_snapshot/cluster_51:0.012965945932247496 - cluster/prob_snapshot/cluster_52:0.014662380428827938 - cluster/prob_snapshot/cluster_53:0.01485361116019517 - cluster/prob_snapshot/cluster_54:0.018735452449429247 - cluster/prob_snapshot/cluster_55:0.017039017952848803 - cluster/prob_snapshot/cluster_56:0.018882144138251202 - cluster/prob_snapshot/cluster_57:0.018036920597896123 - cluster/prob_snapshot/cluster_58:0.020202369337648806 - cluster/prob_snapshot/cluster_59:0.020173603097470818 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.014187867538427887 - cluster/prob_snapshot/cluster_62:0.017039017952848803 - cluster/prob_snapshot/cluster_63:0.00911689287277926
[36m(TaskRunner pid=2823680)[0m Training Progress:  21%|██        | 169/800 [5:18:50<21:49:17, 124.50s/it]
[36m(TaskRunner pid=2823680)[0m step:169 - global_seqlen/min:358245 - global_seqlen/max:426061 - global_seqlen/minmax_diff:67816 - global_seqlen/balanced_min:389956 - global_seqlen/balanced_max:390054 - global_seqlen/mean:389998.0 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.21817924758913565) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012263678945600986 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.01524650132341776) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002966215814346785) - actor/ppo_kl:np.float64(6.322980521867883e-05) - actor/pg_clipfrac_lower:np.float64(1.6838975152124328e-07) - actor/grad_norm:np.float64(0.2301380427984091) - perf/mfu/actor:np.float64(0.19692228418868332) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(263.6359062194824) - actor/lr:np.float64(1e-06) - training/global_step:169 - training/epoch:0 - critic/score/mean:0.5876288414001465 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5772047638893127 - critic/rewards/max:1.0205912590026855 - critic/rewards/min:-0.06589192897081375 - critic/advantages/mean:-0.1769401878118515 - critic/advantages/max:2.474778890609741 - critic/advantages/min:-2.4748036861419678 - critic/returns/mean:-0.1769401878118515 - critic/returns/max:2.474778890609741 - critic/returns/min:-2.4748036861419678 - response_length/mean:1182.3634033203125 - response_length/max:8192.0 - response_length/min:162.0 - response_length/clip_ratio:0.01417525764554739 - response_length_non_aborted/mean:1182.3634033203125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:162.0 - response_length_non_aborted/clip_ratio:0.01417525764554739 - response/aborted_ratio:0.0 - prompt_length/mean:234.6597900390625 - prompt_length/max:404.0 - prompt_length/min:186.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.719693869352341e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2919243453070521) - timing_s/agent_loop/generate_sequences/max:np.float64(29.601729773916304) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.103924701156757) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.601729773916304) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:209 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.644405876286328 - timing_s/reward:0.00029244739562273026 - timing_s/old_log_prob:10.924953204579651 - timing_s/ref:24.964995781891048 - timing_s/adv:0.08029760234057903 - timing_s/update_actor:23.271117687225342 - timing_s/update_weights:38.55965371802449 - timing_s/step:129.8743566935882 - timing_s/stop_profile:6.949622184038162e-05 - timing_per_token_ms/adv:7.302371053426127e-05 - timing_per_token_ms/update_actor:0.021163064802271116 - timing_per_token_ms/gen:0.03448928940189068 - timing_per_token_ms/ref:0.02270350013358468 - perf/total_num_tokens:1559992 - perf/time_per_step:129.8743566935882 - perf/throughput:3002.8868664205975 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:440.0 - frontier/mean_score:2.3679752743666667 - frontier/mean_frontier_pct:0.18410059036798948 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.9709999999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.328879409999999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.6569999999999996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.7680699999999998 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.49 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:2.7598999999999996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.0569999999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.343 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:3.89579 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.09 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:1.7398999999999996 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.3 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.5509999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.7598999999999996 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:1.7 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.3 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:2.5540999999999996 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.6569999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:2.8319299999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:1.637 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.6338999999999997 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.7598999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:2.3629999999999995 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.9176456999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.9121299999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.3 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.237 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.0769999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:2.9656999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.7598999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:3.211645699999999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.51 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:2.51 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.237 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.3 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.91 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.7598999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.8400999999999998 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.1598999999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.1880699999999997 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.2319299999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:2.6569999999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.7815089999999993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.6569999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.9759899999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:2.9717524750999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.09 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:169.0 - cluster/prob_snapshot/cluster_0:0.026618413931118582 - cluster/prob_snapshot/cluster_1:0.015610948408723043 - cluster/prob_snapshot/cluster_2:0.009002399876477527 - cluster/prob_snapshot/cluster_3:0.017810406903798055 - cluster/prob_snapshot/cluster_4:0.013406403390137791 - cluster/prob_snapshot/cluster_5:0.01185172982100046 - cluster/prob_snapshot/cluster_6:0.009987770525652655 - 
cluster/prob_snapshot/cluster_7:0.018500166358220643 - cluster/prob_snapshot/cluster_8:0.013788485886756715 - cluster/prob_snapshot/cluster_9:0.009002399876477527 - cluster/prob_snapshot/cluster_10:0.026114266131632452 - cluster/prob_snapshot/cluster_11:0.014009691542693992 - cluster/prob_snapshot/cluster_12:0.01166290062925037 - cluster/prob_snapshot/cluster_13:0.015417363898658458 - cluster/prob_snapshot/cluster_14:0.01682503625462293 - cluster/prob_snapshot/cluster_15:0.023803069219189645 - cluster/prob_snapshot/cluster_16:0.018500166358220643 - cluster/prob_snapshot/cluster_17:0.01682503625462293 - cluster/prob_snapshot/cluster_18:0.011395442881617122 - cluster/prob_snapshot/cluster_19:0.017810406903798055 - cluster/prob_snapshot/cluster_20:0.015417363898658458 - cluster/prob_snapshot/cluster_21:0.017120647449375463 - cluster/prob_snapshot/cluster_22:0.009002399876477527 - cluster/prob_snapshot/cluster_23:0.017810406903798055 - cluster/prob_snapshot/cluster_24:0.018982997976316453 - cluster/prob_snapshot/cluster_25:0.009987770525652655 - cluster/prob_snapshot/cluster_26:0.010973141174827783 - cluster/prob_snapshot/cluster_27:0.01765556294464196 - cluster/prob_snapshot/cluster_28:0.018500166358220643 - cluster/prob_snapshot/cluster_29:0.0158396656054478 - cluster/prob_snapshot/cluster_30:0.013788485886756715 - cluster/prob_snapshot/cluster_31:0.019557567601850474 - cluster/prob_snapshot/cluster_32:0.012817393057192088 - cluster/prob_snapshot/cluster_33:0.015417363898658458 - cluster/prob_snapshot/cluster_34:0.014995062191869121 - cluster/prob_snapshot/cluster_35:0.020625751615726988 - cluster/prob_snapshot/cluster_36:0.015417363898658458 - cluster/prob_snapshot/cluster_37:0.01682503625462293 - cluster/prob_snapshot/cluster_38:0.019879685267065822 - cluster/prob_snapshot/cluster_39:0.018500166358220643 - cluster/prob_snapshot/cluster_40:0.0194392849156998 - cluster/prob_snapshot/cluster_41:0.021528308900200726 - 
cluster/prob_snapshot/cluster_42:0.009987770525652655 - cluster/prob_snapshot/cluster_43:0.01682503625462293 - cluster/prob_snapshot/cluster_44:0.01682503625462293 - cluster/prob_snapshot/cluster_45:0.014995062191869121 - cluster/prob_snapshot/cluster_46:0.013406403390137791 - cluster/prob_snapshot/cluster_47:0.015417363898658458 - cluster/prob_snapshot/cluster_48:0.01280311523758159 - cluster/prob_snapshot/cluster_49:0.018500166358220643 - cluster/prob_snapshot/cluster_50:0.012334561439096274 - cluster/prob_snapshot/cluster_51:0.01280311523758159 - cluster/prob_snapshot/cluster_52:0.014478245341179304 - cluster/prob_snapshot/cluster_53:0.014667074532929397 - cluster/prob_snapshot/cluster_54:0.014961076959275116 - cluster/prob_snapshot/cluster_55:0.017810406903798055 - cluster/prob_snapshot/cluster_56:0.018645015843649384 - cluster/prob_snapshot/cluster_57:0.017810406903798055 - cluster/prob_snapshot/cluster_58:0.01994866121250808 - cluster/prob_snapshot/cluster_59:0.019920256228415503 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.014009691542693992 - cluster/prob_snapshot/cluster_62:0.01682503625462293 - cluster/prob_snapshot/cluster_63:0.009002399876477527
[36m(TaskRunner pid=2823680)[0m Training Progress:  21%|██▏       | 170/800 [5:21:10<22:34:52, 129.04s/it]
[36m(TaskRunner pid=2823680)[0m step:170 - global_seqlen/min:357818 - global_seqlen/max:477154 - global_seqlen/minmax_diff:119336 - global_seqlen/balanced_min:426761 - global_seqlen/balanced_max:426855 - global_seqlen/mean:426809.0 - frontier/skipped_zero_acc_count:35.0 - actor/entropy:np.float64(0.18912988128338723) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011264670640230179 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04156601906288415) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003394151585683737) - actor/ppo_kl:np.float64(6.192198362136782e-05) - actor/pg_clipfrac_lower:np.float64(1.9186080998166444e-07) - actor/grad_norm:np.float64(0.2263230656584104) - perf/mfu/actor:np.float64(0.24097936260230002) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(120.37183666229248) - actor/lr:np.float64(1e-06) - training/global_step:170 - training/epoch:0 - critic/score/mean:0.5268816947937012 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5151232481002808 - critic/rewards/max:1.0050065517425537 - critic/rewards/min:-0.05422501266002655 - critic/advantages/mean:-0.1358533799648285 - critic/advantages/max:2.474858522415161 - critic/advantages/min:-2.4748311042785645 - critic/returns/mean:-0.1358533799648285 - critic/returns/max:2.474858522415161 - critic/returns/min:-2.4748311042785645 - response_length/mean:1357.75 - response_length/max:8192.0 - response_length/min:269.0 - response_length/clip_ratio:0.02016128972172737 - response_length_non_aborted/mean:1357.75 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:269.0 - response_length_non_aborted/clip_ratio:0.02016128972172737 - response/aborted_ratio:0.0 - prompt_length/mean:236.25807189941406 - prompt_length/max:357.0 - prompt_length/min:181.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00011362694203853607 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(2.154580052010715) - timing_s/agent_loop/generate_sequences/max:np.float64(31.27102138940245) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.130878355238565) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(31.27102138940245) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:222 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:33.152282094582915 - timing_s/reward:0.00013103708624839783 - timing_s/old_log_prob:11.791507882066071 - timing_s/ref:38.946054195053875 - timing_s/adv:0.08710962440818548 - timing_s/update_actor:20.937590297311544 - timing_s/update_weights:33.37589561380446 - timing_s/step:138.78133601881564 - timing_s/stop_profile:5.6724995374679565e-05 - timing_per_token_ms/adv:7.345184200254775e-05 - timing_per_token_ms/update_actor:0.017654818108568164 - timing_per_token_ms/gen:0.03281864772184266 - timing_per_token_ms/ref:0.03283976298592501 - perf/total_num_tokens:1707236 - perf/time_per_step:138.78133601881564 - perf/throughput:3075.406335201545 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:475.0 - frontier/mean_score:2.396209083890476 - frontier/mean_frontier_pct:0.19721117134167127 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.9709999999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.328879409999999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.6569999999999996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.7680699999999998 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:2.8319299999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.0569999999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.8400999999999998 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:3.89579 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.09 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:1.7398999999999996 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.3 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.3856999999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:2.2319299999999993 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.69 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.51 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:2.5540999999999996 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.6569999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:2.8319299999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.4459 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.6338999999999997 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.7598999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:2.3629999999999995 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.9176456999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.9121299999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:2.51 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.237 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.0769999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:2.9656999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.7598999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:3.211645699999999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.9429999999999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.6569999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:2.6569999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.4659 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.91 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.7598999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:1.5880699999999999 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.1598999999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.1880699999999997 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.2319299999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:2.6569999999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.7815089999999993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.6569999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.9759899999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:2.9717524750999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.09 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:170.0 - cluster/prob_snapshot/cluster_0:0.02630477718138349 - cluster/prob_snapshot/cluster_1:0.015427009308073992 - cluster/prob_snapshot/cluster_2:0.008896327311659034 - cluster/prob_snapshot/cluster_3:0.01760055224652126 - cluster/prob_snapshot/cluster_4:0.013248439779090149 - cluster/prob_snapshot/cluster_5:0.011712084460107958 - cluster/prob_snapshot/cluster_6:0.008896327311659034 - 
cluster/prob_snapshot/cluster_7:0.01875932703179938 - cluster/prob_snapshot/cluster_8:0.013626020312794215 - cluster/prob_snapshot/cluster_9:0.012189227018751891 - cluster/prob_snapshot/cluster_10:0.025806569603490804 - cluster/prob_snapshot/cluster_11:0.013844619569149204 - cluster/prob_snapshot/cluster_12:0.011525480185819472 - cluster/prob_snapshot/cluster_13:0.015235705745953669 - cluster/prob_snapshot/cluster_14:0.016626791922758136 - cluster/prob_snapshot/cluster_15:0.022427621280032756 - cluster/prob_snapshot/cluster_16:0.014784795098072333 - cluster/prob_snapshot/cluster_17:0.016626791922758136 - cluster/prob_snapshot/cluster_18:0.01781915150287625 - cluster/prob_snapshot/cluster_19:0.01760055224652126 - cluster/prob_snapshot/cluster_20:0.016626791922758136 - cluster/prob_snapshot/cluster_21:0.01691892001988707 - cluster/prob_snapshot/cluster_22:0.008214695085024846 - cluster/prob_snapshot/cluster_23:0.01760055224652126 - cluster/prob_snapshot/cluster_24:0.01875932703179938 - cluster/prob_snapshot/cluster_25:0.00987008763542216 - cluster/prob_snapshot/cluster_26:0.009577959538293223 - cluster/prob_snapshot/cluster_27:0.017447532767072768 - cluster/prob_snapshot/cluster_28:0.01828218447315545 - cluster/prob_snapshot/cluster_29:0.015653031598995008 - cluster/prob_snapshot/cluster_30:0.013626020312794215 - cluster/prob_snapshot/cluster_31:0.01932712667658566 - cluster/prob_snapshot/cluster_32:0.012666369577395822 - cluster/prob_snapshot/cluster_33:0.016626791922758136 - cluster/prob_snapshot/cluster_34:0.014818379892912332 - cluster/prob_snapshot/cluster_35:0.02038272460013019 - cluster/prob_snapshot/cluster_36:0.015235705745953669 - cluster/prob_snapshot/cluster_37:0.016626791922758136 - cluster/prob_snapshot/cluster_38:0.019645448926423822 - cluster/prob_snapshot/cluster_39:0.01828218447315545 - cluster/prob_snapshot/cluster_40:0.019210237679680716 - cluster/prob_snapshot/cluster_41:0.021274647324111908 - cluster/prob_snapshot/cluster_42:0.012870859245386078 
- cluster/prob_snapshot/cluster_43:0.01760055224652126 - cluster/prob_snapshot/cluster_44:0.01760055224652126 - cluster/prob_snapshot/cluster_45:0.0163346638256292 - cluster/prob_snapshot/cluster_46:0.013248439779090149 - cluster/prob_snapshot/cluster_47:0.016626791922758136 - cluster/prob_snapshot/cluster_48:0.012652259989031092 - cluster/prob_snapshot/cluster_49:0.01828218447315545 - cluster/prob_snapshot/cluster_50:0.010519724879989846 - cluster/prob_snapshot/cluster_51:0.012652259989031092 - cluster/prob_snapshot/cluster_52:0.014307652539428402 - cluster/prob_snapshot/cluster_53:0.01449425681371689 - cluster/prob_snapshot/cluster_54:0.014784795098072333 - cluster/prob_snapshot/cluster_55:0.01760055224652126 - cluster/prob_snapshot/cluster_56:0.018425327240748624 - cluster/prob_snapshot/cluster_57:0.01760055224652126 - cluster/prob_snapshot/cluster_58:0.01971361214908724 - cluster/prob_snapshot/cluster_59:0.01968554185236222 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.013844619569149204 - cluster/prob_snapshot/cluster_62:0.016626791922758136 - cluster/prob_snapshot/cluster_63:0.008896327311659034
[36m(TaskRunner pid=2823680)[0m Training Progress:  21%|██▏       | 171/800 [5:23:04<21:46:46, 124.65s/it]
[36m(TaskRunner pid=2823680)[0m step:171 - global_seqlen/min:306883 - global_seqlen/max:447740 - global_seqlen/minmax_diff:140857 - global_seqlen/balanced_min:379158 - global_seqlen/balanced_max:379386 - global_seqlen/mean:379291.0 - frontier/skipped_zero_acc_count:26.0 - actor/entropy:np.float64(0.18031871945177222) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013230622746050358 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04491132029943401) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0002969415908962986) - actor/ppo_kl:np.float64(0.00011433817691313914) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.2296741914290648) - perf/mfu/actor:np.float64(0.18430361743024826) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(126.38438034057617) - actor/lr:np.float64(1e-06) - training/global_step:171 - training/epoch:0 - critic/score/mean:0.6164215803146362 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6053299903869629 - critic/rewards/max:1.0051982402801514 - critic/rewards/min:-0.06171702593564987 - critic/advantages/mean:-0.2090798020362854 - critic/advantages/max:2.47483229637146 - critic/advantages/min:-2.474846124649048 - critic/returns/mean:-0.2090798020362854 - critic/returns/max:2.47483229637146 - critic/returns/min:-2.474846124649048 - response_length/mean:1144.36279296875 - response_length/max:8192.0 - response_length/min:187.0 - response_length/clip_ratio:0.013480392284691334 - response_length_non_aborted/mean:1144.36279296875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:187.0 - response_length_non_aborted/clip_ratio:0.013480392284691334 - response/aborted_ratio:0.0 - prompt_length/mean:227.9705810546875 - prompt_length/max:321.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.532218635082245e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4951510531827807) - timing_s/agent_loop/generate_sequences/max:np.float64(28.847492484375834) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.719966598911014) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.847492484375834) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:194 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.57813427131623 - timing_s/reward:0.0006743436679244041 - timing_s/old_log_prob:11.052465282380581 - timing_s/ref:20.0316872196272 - timing_s/adv:0.10467082262039185 - timing_s/update_actor:24.180923387408257 - timing_s/update_weights:27.722379761748016 - timing_s/step:114.08929086662829 - timing_s/stop_profile:6.832368671894073e-05 - timing_per_token_ms/adv:9.347077989076126e-05 - timing_per_token_ms/update_actor:0.021593503432153854 - timing_per_token_ms/gen:0.0327459137623862 - timing_per_token_ms/ref:0.01788824602761434 - perf/total_num_tokens:1517164 - perf/time_per_step:114.08929086662829 - perf/throughput:3324.510101858689 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:501.0 - frontier/mean_score:2.4343967387709515 - frontier/mean_frontier_pct:0.2133739407069543 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.6796999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.328879409999999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:2.7598999999999996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.7680699999999998 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:2.8319299999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.0569999999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.8400999999999998 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:3.89579 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.09 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:2.1179299999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.51 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.3856999999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:2.2319299999999993 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.69 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.51 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:48.0 - frontier/cluster_21/score:3.28787 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.6569999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:2.8319299999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.4459 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.7437299999999993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.8319299999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:2.3629999999999995 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.9176456999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.9121299999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:2.0569999999999995 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.4659 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.0769999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:2.9656999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.7598999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:3.211645699999999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:2.2600999999999996 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.6569999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:2.6569999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.4659 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.6569999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.91 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.8319299999999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:80.0 - frontier/cluster_50/score:1.411649 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.1598999999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:3.031649 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.2319299999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.7598999999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.7815089999999993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.6569999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.9759899999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:2.9802267325699994 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.09 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:171.0 - cluster/prob_snapshot/cluster_0:0.023992776353054426 - cluster/prob_snapshot/cluster_1:0.015185010418611117 - cluster/prob_snapshot/cluster_2:0.008756773281015326 - cluster/prob_snapshot/cluster_3:0.017995397303257034 - cluster/prob_snapshot/cluster_4:0.013040615459442035 - cluster/prob_snapshot/cluster_5:0.011528360487687838 - cluster/prob_snapshot/cluster_6:0.008756773281015326 - 
cluster/prob_snapshot/cluster_7:0.018465055069028836 - cluster/prob_snapshot/cluster_8:0.01341227300003613 - cluster/prob_snapshot/cluster_9:0.011998018253459643 - cluster/prob_snapshot/cluster_10:0.025401749650369843 - cluster/prob_snapshot/cluster_11:0.013627443155116926 - cluster/prob_snapshot/cluster_12:0.01380955535000803 - cluster/prob_snapshot/cluster_13:0.016365972401599753 - cluster/prob_snapshot/cluster_14:0.016365972401599753 - cluster/prob_snapshot/cluster_15:0.022075805880516445 - cluster/prob_snapshot/cluster_16:0.014552870431196227 - cluster/prob_snapshot/cluster_17:0.016365972401599753 - cluster/prob_snapshot/cluster_18:0.017539627792949538 - cluster/prob_snapshot/cluster_19:0.017324457637868742 - cluster/prob_snapshot/cluster_20:0.016365972401599753 - cluster/prob_snapshot/cluster_21:0.02143792417531784 - cluster/prob_snapshot/cluster_22:0.008085833615627034 - cluster/prob_snapshot/cluster_23:0.017324457637868742 - cluster/prob_snapshot/cluster_24:0.018465055069028836 - cluster/prob_snapshot/cluster_25:0.009715258517284317 - cluster/prob_snapshot/cluster_26:0.00942771294640362 - cluster/prob_snapshot/cluster_27:0.01788996392726744 - cluster/prob_snapshot/cluster_28:0.018465055069028836 - cluster/prob_snapshot/cluster_29:0.015407487165330762 - cluster/prob_snapshot/cluster_30:0.01341227300003613 - cluster/prob_snapshot/cluster_31:0.019023947810297287 - cluster/prob_snapshot/cluster_32:0.012467676019231448 - cluster/prob_snapshot/cluster_33:0.01341227300003613 - cluster/prob_snapshot/cluster_34:0.016078426830719056 - cluster/prob_snapshot/cluster_35:0.02006298688435157 - cluster/prob_snapshot/cluster_36:0.01499670777835834 - cluster/prob_snapshot/cluster_37:0.016365972401599753 - cluster/prob_snapshot/cluster_38:0.019337276634033618 - cluster/prob_snapshot/cluster_39:0.017995397303257034 - cluster/prob_snapshot/cluster_40:0.01890889241619095 - cluster/prob_snapshot/cluster_41:0.02094091828283526 - cluster/prob_snapshot/cluster_42:0.014736547499942469 
- cluster/prob_snapshot/cluster_43:0.017324457637868742 - cluster/prob_snapshot/cluster_44:0.017324457637868742 - cluster/prob_snapshot/cluster_45:0.016078426830719056 - cluster/prob_snapshot/cluster_46:0.013040615459442035 - cluster/prob_snapshot/cluster_47:0.017324457637868742 - cluster/prob_snapshot/cluster_48:0.012453787763767142 - cluster/prob_snapshot/cluster_49:0.018465055069028836 - cluster/prob_snapshot/cluster_50:0.009204385886352945 - cluster/prob_snapshot/cluster_51:0.012453787763767142 - cluster/prob_snapshot/cluster_52:0.014083212665424423 - cluster/prob_snapshot/cluster_53:0.019767284408500993 - cluster/prob_snapshot/cluster_54:0.014552870431196227 - cluster/prob_snapshot/cluster_55:0.017995397303257034 - cluster/prob_snapshot/cluster_56:0.018136294632988572 - cluster/prob_snapshot/cluster_57:0.017324457637868742 - cluster/prob_snapshot/cluster_58:0.019404370600572447 - cluster/prob_snapshot/cluster_59:0.01943199540069738 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.013627443155116926 - cluster/prob_snapshot/cluster_62:0.016365972401599753 - cluster/prob_snapshot/cluster_63:0.008756773281015326
[36m(RewardLoopWorker pid=2826759)[0m WARNING:2026-04-12 16:55:31,435:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  22%|██▏       | 172/800 [5:24:51<20:47:56, 119.23s/it]
[36m(TaskRunner pid=2823680)[0m step:172 - global_seqlen/min:340095 - global_seqlen/max:406478 - global_seqlen/minmax_diff:66383 - global_seqlen/balanced_min:374367 - global_seqlen/balanced_max:374697 - global_seqlen/mean:374555.25 - frontier/skipped_zero_acc_count:23.0 - actor/entropy:np.float64(0.18104515846748398) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011799103580415249 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.01897366315824911) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003719755815219064) - actor/ppo_kl:np.float64(4.3758730174766174e-05) - actor/pg_clipfrac_lower:np.float64(7.041612404446944e-06) - actor/grad_norm:np.float64(0.23611155152320862) - perf/mfu/actor:np.float64(0.17628800683621396) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(138.08395385742188) - actor/lr:np.float64(1e-06) - training/global_step:172 - training/epoch:0 - critic/score/mean:0.598809540271759 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5879634022712708 - critic/rewards/max:1.0064321756362915 - critic/rewards/min:-0.11891467869281769 - critic/advantages/mean:-0.17159700393676758 - critic/advantages/max:2.4748427867889404 - critic/advantages/min:-2.474839448928833 - critic/returns/mean:-0.17159700393676758 - critic/returns/max:2.4748427867889404 - critic/returns/min:-2.474839448928833 - response_length/mean:1165.7940673828125 - response_length/max:8192.0 - response_length/min:218.0 - response_length/clip_ratio:0.013095238246023655 - response_length_non_aborted/mean:1165.7940673828125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:218.0 - response_length_non_aborted/clip_ratio:0.013095238246023655 - response/aborted_ratio:0.0 - prompt_length/mean:235.77142333984375 - prompt_length/max:522.0 - prompt_length/min:185.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.987875819206238e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7136483592912555) - timing_s/agent_loop/generate_sequences/max:np.float64(28.11301131732762) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.73347244521301) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.11301131732762) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:286 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.29903475381434 - timing_s/reward:0.00032812729477882385 - timing_s/old_log_prob:11.58687790390104 - timing_s/ref:14.415063353255391 - timing_s/adv:0.0803659874945879 - timing_s/update_actor:24.799226864241064 - timing_s/update_weights:24.372320679947734 - timing_s/step:105.95861729979515 - timing_s/stop_profile:6.586313247680664e-05 - timing_per_token_ms/adv:6.826209425225016e-05 - timing_per_token_ms/update_actor:0.021064223987837635 - timing_per_token_ms/gen:0.030940524651412067 - timing_per_token_ms/ref:0.01224401570799267 - perf/total_num_tokens:1498221 - perf/time_per_step:105.95861729979515 - perf/throughput:3534.920137172497 - frontier/active_count:62.0 - frontier/completed_count:2.0 - frontier/blacklisted_count:524.0 - frontier/mean_score:2.4779105345575805 - frontier/mean_frontier_pct:0.22950055470199165 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.4757899999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.328879409999999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.4319299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.7680699999999998 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:2.8319299999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.0569999999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:1.8400999999999998 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.227053 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.09 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:2.1179299999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.51 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.3856999999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:2.2319299999999993 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.69 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.51 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:48.0 - frontier/cluster_21/score:3.2015089999999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.7598999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.4823509999999995 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.4459 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.7437299999999993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.8319299999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:2.3629999999999995 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.9176456999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.9121299999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - 
frontier/cluster_33/score:2.339899999999999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.4659 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.0769999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:2.9656999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.8319299999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:3.211645699999999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:2.2600999999999996 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.6569999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:2.7598999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.62613 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.91 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.8319299999999994 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:80.0 - frontier/cluster_50/score:1.411649 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.1598999999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:3.0221542999999995 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.2319299999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.7598999999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.8470562999999993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.6569999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:2.3831929999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:2.9802267325699994 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.09 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:172.0 - cluster/prob_snapshot/cluster_0:0.022624355581210485 - cluster/prob_snapshot/cluster_1:0.015158969868029906 - cluster/prob_snapshot/cluster_2:0.008741756419566685 - cluster/prob_snapshot/cluster_3:0.022338865308267675 - cluster/prob_snapshot/cluster_4:0.013018252300173767 - 
cluster/prob_snapshot/cluster_5:0.011508590672184115 - cluster/prob_snapshot/cluster_6:0.008741756419566685 - cluster/prob_snapshot/cluster_7:0.018433389618215546 - cluster/prob_snapshot/cluster_8:0.013389272490728717 - cluster/prob_snapshot/cluster_9:0.011977443028774874 - cluster/prob_snapshot/cluster_10:0.02751442122010321 - cluster/prob_snapshot/cluster_11:0.013604073653681586 - cluster/prob_snapshot/cluster_12:0.01378587354705351 - cluster/prob_snapshot/cluster_13:0.016337906636718077 - cluster/prob_snapshot/cluster_14:0.016337906636718077 - cluster/prob_snapshot/cluster_15:0.02203794840634916 - cluster/prob_snapshot/cluster_16:0.014527913928163414 - cluster/prob_snapshot/cluster_17:0.016337906636718077 - cluster/prob_snapshot/cluster_18:0.017509549343733718 - cluster/prob_snapshot/cluster_19:0.017294748180780847 - cluster/prob_snapshot/cluster_20:0.016337906636718077 - cluster/prob_snapshot/cluster_21:0.020839025951638508 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.017964537261624787 - cluster/prob_snapshot/cluster_24:0.022667061957881207 - cluster/prob_snapshot/cluster_25:0.008741756419566685 - cluster/prob_snapshot/cluster_26:0.009411545500410625 - cluster/prob_snapshot/cluster_27:0.01785928469177788 - cluster/prob_snapshot/cluster_28:0.018433389618215546 - cluster/prob_snapshot/cluster_29:0.015381065092655303 - cluster/prob_snapshot/cluster_30:0.013389272490728717 - cluster/prob_snapshot/cluster_31:0.018991323922558548 - cluster/prob_snapshot/cluster_32:0.012446295385365633 - cluster/prob_snapshot/cluster_33:0.015230704278588294 - cluster/prob_snapshot/cluster_34:0.016050854173499246 - cluster/prob_snapshot/cluster_35:0.02002858116381734 - cluster/prob_snapshot/cluster_36:0.014970990145199831 - cluster/prob_snapshot/cluster_37:0.016337906636718077 - cluster/prob_snapshot/cluster_38:0.01930411542331267 - cluster/prob_snapshot/cluster_39:0.018433389618215546 - cluster/prob_snapshot/cluster_40:0.018876465835251963 - 
cluster/prob_snapshot/cluster_41:0.02090500701068409 - cluster/prob_snapshot/cluster_42:0.014711276011811363 - cluster/prob_snapshot/cluster_43:0.017294748180780847 - cluster/prob_snapshot/cluster_44:0.017964537261624787 - cluster/prob_snapshot/cluster_45:0.017093811456527667 - cluster/prob_snapshot/cluster_46:0.013018252300173767 - cluster/prob_snapshot/cluster_47:0.017964537261624787 - cluster/prob_snapshot/cluster_48:0.012432430946665948 - cluster/prob_snapshot/cluster_49:0.018433389618215546 - cluster/prob_snapshot/cluster_50:0.009188601420643998 - cluster/prob_snapshot/cluster_51:0.012432430946665948 - cluster/prob_snapshot/cluster_52:0.014059061571572657 - cluster/prob_snapshot/cluster_53:0.019671583583727517 - cluster/prob_snapshot/cluster_54:0.014527913928163414 - cluster/prob_snapshot/cluster_55:0.017964537261624787 - cluster/prob_snapshot/cluster_56:0.018531848613099602 - cluster/prob_snapshot/cluster_57:0.017294748180780847 - cluster/prob_snapshot/cluster_58:0.015512503877004008 - cluster/prob_snapshot/cluster_59:0.01939867175815937 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.013604073653681586 - cluster/prob_snapshot/cluster_62:0.016337906636718077 - cluster/prob_snapshot/cluster_63:0.008741756419566685
[36m(TaskRunner pid=2823680)[0m Training Progress:  22%|██▏       | 173/800 [5:26:50<20:46:37, 119.29s/it]
[36m(TaskRunner pid=2823680)[0m step:173 - global_seqlen/min:338670 - global_seqlen/max:470398 - global_seqlen/minmax_diff:131728 - global_seqlen/balanced_min:375726 - global_seqlen/balanced_max:375832 - global_seqlen/mean:375778.5 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.18140376704589775) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012744324281811714 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.022406538719224045) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006047915838583625) - actor/ppo_kl:np.float64(0.00010324733135380626) - actor/pg_clipfrac_lower:np.float64(4.927933711466418e-07) - actor/grad_norm:np.float64(0.22327874725063643) - perf/mfu/actor:np.float64(0.2036756984023375) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(162.70925426483154) - actor/lr:np.float64(1e-06) - training/global_step:173 - training/epoch:0 - critic/score/mean:0.5736842155456543 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5628092288970947 - critic/rewards/max:1.0016194581985474 - critic/rewards/min:-0.06064445897936821 - critic/advantages/mean:-0.18517015874385834 - critic/advantages/max:2.4747824668884277 - critic/advantages/min:-2.4747860431671143 - critic/returns/mean:-0.18517015874385834 - critic/returns/max:2.4747824668884277 - critic/returns/min:-2.4747860431671143 - response_length/mean:1210.5986328125 - response_length/max:8192.0 - response_length/min:144.0 - response_length/clip_ratio:0.017105262726545334 - response_length_non_aborted/mean:1210.5986328125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:144.0 - response_length_non_aborted/clip_ratio:0.017105262726545334 - response/aborted_ratio:0.0 - prompt_length/mean:235.08421325683594 - prompt_length/max:382.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.34237614274025e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2026671692728996) - timing_s/agent_loop/generate_sequences/max:np.float64(28.667999485507607) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.738358793490079) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.667999485507607) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:214 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.343528821133077 - timing_s/reward:0.00018181465566158295 - timing_s/old_log_prob:10.993909990414977 - timing_s/ref:22.0280432542786 - timing_s/adv:0.07750822603702545 - timing_s/update_actor:21.69484654441476 - timing_s/update_weights:33.608996340073645 - timing_s/step:119.18855373095721 - timing_s/stop_profile:5.6665390729904175e-05 - timing_per_token_ms/adv:7.05441755690267e-05 - timing_per_token_ms/update_actor:0.01974558239587625 - timing_per_token_ms/gen:0.03298012490680783 - timing_per_token_ms/ref:0.020048841654944166 - perf/total_num_tokens:1503114 - perf/time_per_step:119.18855373095721 - perf/throughput:3152.8069452729496 - frontier/active_count:62.0 - frontier/completed_count:2.0 - frontier/blacklisted_count:557.0 - frontier/mean_score:2.4712092553677256 - frontier/mean_frontier_pct:0.24686295754661328 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.3330529999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.328879409999999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.4319299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.7680699999999998 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:2.8319299999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.0569999999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.1880699999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.227053 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.09 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:2.1179299999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.6569999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.3856999999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:2.2319299999999993 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:2.183 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.6569999999999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:48.0 - frontier/cluster_21/score:3.2015089999999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.7598999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.4823509999999995 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:1.31213 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.7437299999999993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.8319299999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:2.3629999999999995 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.9176456999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.9121299999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - 
frontier/cluster_33/score:2.339899999999999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.4659 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.0769999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:1.91 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.9759899999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.8319299999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:3.211645699999999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:2.4820699999999993 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.7598999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:2.7598999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:2.7382909999999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.91 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:2.8823509999999994 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:80.0 - frontier/cluster_50/score:1.411649 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.1598999999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:3.0155080099999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.2319299999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.8319299999999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.8470562999999993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.6569999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:2.3831929999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9861587127989995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.09 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.0569999999999995 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:173.0 - cluster/prob_snapshot/cluster_0:0.02175409437224658 - cluster/prob_snapshot/cluster_1:0.015200077066497869 - cluster/prob_snapshot/cluster_2:0.008765461797915352 - cluster/prob_snapshot/cluster_3:0.022399442522799427 - cluster/prob_snapshot/cluster_4:0.013053554427275283 - 
cluster/prob_snapshot/cluster_5:0.011539798988116304 - cluster/prob_snapshot/cluster_6:0.008765461797915352 - cluster/prob_snapshot/cluster_7:0.018483376194616843 - cluster/prob_snapshot/cluster_8:0.013425580728452626 - cluster/prob_snapshot/cluster_9:0.014281045417844113 - cluster/prob_snapshot/cluster_10:0.027589033201238632 - cluster/prob_snapshot/cluster_11:0.013640964376502671 - cluster/prob_snapshot/cluster_12:0.013823257264079567 - cluster/prob_snapshot/cluster_13:0.017341647056635212 - cluster/prob_snapshot/cluster_14:0.01638221080623048 - cluster/prob_snapshot/cluster_15:0.02209770961221296 - cluster/prob_snapshot/cluster_16:0.014567309866434258 - cluster/prob_snapshot/cluster_17:0.01638221080623048 - cluster/prob_snapshot/cluster_18:0.01424795465737097 - cluster/prob_snapshot/cluster_19:0.017341647056635212 - cluster/prob_snapshot/cluster_20:0.017341647056635212 - cluster/prob_snapshot/cluster_21:0.020895535990455832 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.018013252431918526 - cluster/prob_snapshot/cluster_24:0.022728529156688253 - cluster/prob_snapshot/cluster_25:0.008765461797915352 - cluster/prob_snapshot/cluster_26:0.00856398018533036 - cluster/prob_snapshot/cluster_27:0.017907714444374005 - cluster/prob_snapshot/cluster_28:0.018483376194616843 - cluster/prob_snapshot/cluster_29:0.015422774555825745 - cluster/prob_snapshot/cluster_30:0.013425580728452626 - cluster/prob_snapshot/cluster_31:0.019042823472227844 - cluster/prob_snapshot/cluster_32:0.012480046513512944 - cluster/prob_snapshot/cluster_33:0.015272006002190712 - cluster/prob_snapshot/cluster_34:0.01609437993110906 - cluster/prob_snapshot/cluster_35:0.02008289348636302 - cluster/prob_snapshot/cluster_36:0.012466144478047895 - cluster/prob_snapshot/cluster_37:0.01638221080623048 - cluster/prob_snapshot/cluster_38:0.019423623720013484 - cluster/prob_snapshot/cluster_39:0.018483376194616843 - cluster/prob_snapshot/cluster_40:0.01892765391954916 - 
cluster/prob_snapshot/cluster_41:0.02096169597303731 - cluster/prob_snapshot/cluster_42:0.016199917918653578 - cluster/prob_snapshot/cluster_43:0.018013252431918526 - cluster/prob_snapshot/cluster_44:0.018013252431918526 - cluster/prob_snapshot/cluster_45:0.017872215303109028 - cluster/prob_snapshot/cluster_46:0.013053554427275283 - cluster/prob_snapshot/cluster_47:0.018013252431918526 - cluster/prob_snapshot/cluster_48:0.012466144478047895 - cluster/prob_snapshot/cluster_49:0.018812462828505665 - cluster/prob_snapshot/cluster_50:0.009213518526854363 - cluster/prob_snapshot/cluster_51:0.012466144478047895 - cluster/prob_snapshot/cluster_52:0.01409718610373594 - cluster/prob_snapshot/cluster_53:0.019681548967209788 - cluster/prob_snapshot/cluster_54:0.014567309866434258 - cluster/prob_snapshot/cluster_55:0.018483376194616843 - cluster/prob_snapshot/cluster_56:0.01858210218478349 - cluster/prob_snapshot/cluster_57:0.017341647056635212 - cluster/prob_snapshot/cluster_58:0.01555456976810073 - cluster/prob_snapshot/cluster_59:0.01948999264300202 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.013640964376502671 - cluster/prob_snapshot/cluster_62:0.013425580728452626 - cluster/prob_snapshot/cluster_63:0.008765461797915352
[36m(TaskRunner pid=2823680)[0m Training Progress:  22%|██▏       | 174/800 [5:28:43<20:22:18, 117.15s/it]
[36m(TaskRunner pid=2823680)[0m step:174 - global_seqlen/min:318250 - global_seqlen/max:408695 - global_seqlen/minmax_diff:90445 - global_seqlen/balanced_min:353852 - global_seqlen/balanced_max:354122 - global_seqlen/mean:354019.25 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.17809074426380297) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013917873613536358 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.057204338983865455) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003912200619424766) - actor/ppo_kl:np.float64(1.5078347877543289e-05) - actor/pg_clipfrac_lower:np.float64(3.126438249031101e-07) - actor/grad_norm:np.float64(0.22586852808793387) - perf/mfu/actor:np.float64(0.18989829792737542) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(104.60724639892578) - actor/lr:np.float64(1e-06) - training/global_step:174 - training/epoch:0 - critic/score/mean:0.66796875 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6569457650184631 - critic/rewards/max:1.0050082206726074 - critic/rewards/min:-0.08467026054859161 - critic/advantages/mean:-0.18426238000392914 - critic/advantages/max:2.4748077392578125 - critic/advantages/min:-2.4748244285583496 - critic/returns/mean:-0.18426238000392914 - critic/returns/max:2.4748077392578125 - critic/returns/min:-2.4748244285583496 - response_length/mean:1071.359375 - response_length/max:8192.0 - response_length/min:175.0 - response_length/clip_ratio:0.01171875 - response_length_non_aborted/mean:1071.359375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:175.0 - response_length_non_aborted/clip_ratio:0.01171875 - response/aborted_ratio:0.0 - prompt_length/mean:225.6458282470703 - prompt_length/max:341.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - 
num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.763838559389114e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5000801626592875) - timing_s/agent_loop/generate_sequences/max:np.float64(27.70155424810946) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.357321482784755) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.70155424810946) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:224 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.924644670449197 - timing_s/reward:0.0001492183655500412 - timing_s/old_log_prob:11.46273535490036 - timing_s/ref:21.586494223214686 - timing_s/adv:0.07521969452500343 - timing_s/update_actor:21.855721867643297 - timing_s/update_weights:26.565980567596853 - timing_s/step:111.89134069532156 - timing_s/stop_profile:7.914938032627106e-05 - timing_per_token_ms/adv:7.55141999046315e-05 - timing_per_token_ms/update_actor:0.021941292909992267 - timing_per_token_ms/gen:0.036369104513892976 - timing_per_token_ms/ref:0.02167101116676507 - perf/total_num_tokens:1416077 - perf/time_per_step:111.89134069532156 - perf/throughput:3163.955743134664 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:589.0 - frontier/mean_score:2.524764532933758 - frontier/mean_frontier_pct:0.25929600827610566 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:80.0 
- frontier/cluster_0/score:3.8331370999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.328879409999999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.4319299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.7680699999999998 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:2.8319299999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.0569999999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.1880699999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:96.0 - frontier/cluster_10/score:4.4589371 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.09 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.3825509999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.7598999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.6569999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.3856999999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:2.2319299999999993 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:2.183 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:3.3598999999999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:48.0 - frontier/cluster_21/score:3.2015089999999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.8319299999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.4823509999999995 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.7437299999999993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.8319299999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:2.3629999999999995 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.9176456999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:1.9121299999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:2.339899999999999 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.62613 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.0769999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:1.91 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.6569999999999996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.9759899999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.8319299999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:3.1481519899999992 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:2.4820699999999993 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.8319299999999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:2.7598999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:2.8168036999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.91 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:2.8823509999999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:80.0 - 
frontier/cluster_50/score:1.411649 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.1598999999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:3.0155080099999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.2319299999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.8319299999999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.8470562999999993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.9682350999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9903110989592996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.09 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.0569999999999995 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:174.0 - cluster/prob_snapshot/cluster_0:0.024888781625008186 - cluster/prob_snapshot/cluster_1:0.015121549152642593 - cluster/prob_snapshot/cluster_2:0.008720176933505979 - cluster/prob_snapshot/cluster_3:0.02228372064289439 - cluster/prob_snapshot/cluster_4:0.012986116058832433 - cluster/prob_snapshot/cluster_5:0.011480181110069928 - 
cluster/prob_snapshot/cluster_6:0.008720176933505979 - cluster/prob_snapshot/cluster_7:0.01838788582524466 - cluster/prob_snapshot/cluster_8:0.013356220366509153 - cluster/prob_snapshot/cluster_9:0.014207265482424737 - cluster/prob_snapshot/cluster_10:0.028952137339816858 - cluster/prob_snapshot/cluster_11:0.01357049128147989 - cluster/prob_snapshot/cluster_12:0.01547004190104363 - cluster/prob_snapshot/cluster_13:0.017920190855385813 - cluster/prob_snapshot/cluster_14:0.017252055184158885 - cluster/prob_snapshot/cluster_15:0.021983546570194478 - cluster/prob_snapshot/cluster_16:0.01449205100759493 - cluster/prob_snapshot/cluster_17:0.0162975756538347 - cluster/prob_snapshot/cluster_18:0.014174345678215598 - cluster/prob_snapshot/cluster_19:0.017252055184158885 - cluster/prob_snapshot/cluster_20:0.021816025673035543 - cluster/prob_snapshot/cluster_21:0.02078758371869828 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01838788582524466 - cluster/prob_snapshot/cluster_24:0.022611107121795586 - cluster/prob_snapshot/cluster_25:0.008720176933505979 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.01781519810705015 - cluster/prob_snapshot/cluster_28:0.01838788582524466 - cluster/prob_snapshot/cluster_29:0.015343096123510515 - cluster/prob_snapshot/cluster_30:0.013356220366509153 - cluster/prob_snapshot/cluster_31:0.018944442839376692 - cluster/prob_snapshot/cluster_32:0.012415571049787629 - cluster/prob_snapshot/cluster_33:0.015193106483030999 - cluster/prob_snapshot/cluster_34:0.017051614482790807 - cluster/prob_snapshot/cluster_35:0.019979139556513693 - cluster/prob_snapshot/cluster_36:0.012401740836184972 - cluster/prob_snapshot/cluster_37:0.017252055184158885 - cluster/prob_snapshot/cluster_38:0.01932327576496236 - cluster/prob_snapshot/cluster_39:0.01838788582524466 - cluster/prob_snapshot/cluster_40:0.018829868285307028 - cluster/prob_snapshot/cluster_41:0.020441133556492134 - 
cluster/prob_snapshot/cluster_42:0.016116224543073102 - cluster/prob_snapshot/cluster_43:0.01838788582524466 - cluster/prob_snapshot/cluster_44:0.017920190855385813 - cluster/prob_snapshot/cluster_45:0.0182896698815743 - cluster/prob_snapshot/cluster_46:0.012986116058832433 - cluster/prob_snapshot/cluster_47:0.017920190855385813 - cluster/prob_snapshot/cluster_48:0.012401740836184972 - cluster/prob_snapshot/cluster_49:0.018715272304145856 - cluster/prob_snapshot/cluster_50:0.00916591887416737 - cluster/prob_snapshot/cluster_51:0.012401740836184972 - cluster/prob_snapshot/cluster_52:0.014024356037736082 - cluster/prob_snapshot/cluster_53:0.01957986849709941 - cluster/prob_snapshot/cluster_54:0.01449205100759493 - cluster/prob_snapshot/cluster_55:0.01838788582524466 - cluster/prob_snapshot/cluster_56:0.01848610176891502 - cluster/prob_snapshot/cluster_57:0.017920190855385813 - cluster/prob_snapshot/cluster_58:0.012779864719833826 - cluster/prob_snapshot/cluster_59:0.01941626349155011 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.01357049128147989 - cluster/prob_snapshot/cluster_62:0.013356220366509153 - cluster/prob_snapshot/cluster_63:0.008720176933505979
(RewardLoopWorker pid=2826760) WARNING:2026-04-12 17:01:14,304:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) local_global_step_folder: /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/checkpoints/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633/global_step_175
(WorkerDict pid=2825159) /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.
(WorkerDict pid=2825159)   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")
(TaskRunner pid=2823680) test_gen_batch meta info: {'eos_token_id': 151643, 'pad_token_id': 151643, 'recompute_log_prob': False, 'do_sample': True, 'validate': True, 'global_steps': 175}
(TaskRunner pid=2823680) validation generation end
(TaskRunner pid=2823680) Training Progress:  22%|██▏       | 175/800 [5:33:45<29:57:59, 172.61s/it]
(WorkerDict pid=2825160) /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead. [repeated 3x across cluster]
(WorkerDict pid=2825160)   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}") [repeated 3x across cluster]
[36m(TaskRunner pid=2823680)[0m step:175 - global_seqlen/min:379570 - global_seqlen/max:474715 - global_seqlen/minmax_diff:95145 - global_seqlen/balanced_min:409094 - global_seqlen/balanced_max:409173 - global_seqlen/mean:409115.25 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.21286773091802993) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012664459645748138 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.09580057632410899) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004569415845783927) - actor/ppo_kl:np.float64(-4.981987895943026e-05) - actor/pg_clipfrac_lower:np.float64(4.975480957606729e-07) - actor/grad_norm:np.float64(0.23138684406876564) - perf/mfu/actor:np.float64(0.22881514752193413) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(105.8760757446289) - actor/lr:np.float64(1e-06) - val-aux/aime2024/reward/mean@16:np.float64(0.1) - val-aux/aime2024/reward/std@16:np.float64(0.11773339488556522) - val-aux/aime2024/reward/best@2/mean:np.float64(0.14886666666666667) - val-aux/aime2024/reward/best@2/std:np.float64(0.11691805467363334) - val-aux/aime2024/reward/worst@2/mean:np.float64(0.05023333333333334) - val-aux/aime2024/reward/worst@2/std:np.float64(0.08453444805557388) - val-aux/aime2024/reward/maj@2/mean:np.float64(0.09963333333333332) - val-aux/aime2024/reward/maj@2/std:np.float64(0.11801865320122487) - val-aux/aime2024/reward/best@4/mean:np.float64(0.19556666666666667) - val-aux/aime2024/reward/best@4/std:np.float64(0.09963703312889909) - val-aux/aime2024/reward/worst@4/mean:np.float64(0.0176) - val-aux/aime2024/reward/worst@4/std:np.float64(0.0463895371711608) - val-aux/aime2024/reward/maj@4/mean:np.float64(0.12763333333333332) - val-aux/aime2024/reward/maj@4/std:np.float64(0.10996525109325783) - val-aux/aime2024/reward/best@8/mean:np.float64(0.23296666666666666) - 
val-aux/aime2024/reward/best@8/std:np.float64(0.07220099768908175) - val-aux/aime2024/reward/worst@8/mean:np.float64(0.0028) - val-aux/aime2024/reward/worst@8/std:np.float64(0.016913985403416938) - val-aux/aime2024/reward/maj@8/mean:np.float64(0.15096666666666667) - val-aux/aime2024/reward/maj@8/std:np.float64(0.0906451137054987) - val-aux/aime2024/reward/best@16/mean:np.float64(0.26313333333333333) - val-aux/aime2024/reward/best@16/std:np.float64(0.05530171939039241) - val-aux/aime2024/reward/worst@16/mean:np.float64(3.3333333333333335e-05) - val-aux/aime2024/reward/worst@16/std:np.float64(0.001053565375285274) - val-aux/aime2024/reward/maj@16/mean:np.float64(0.1711333333333333) - val-aux/aime2024/reward/maj@16/std:np.float64(0.06627669773850645) - val-aux/aime2024/score/mean@16:np.float64(0.1) - val-aux/aime2024/score/std@16:np.float64(0.11773339488556522) - val-aux/aime2024/score/best@2/mean:np.float64(0.14886666666666667) - val-aux/aime2024/score/best@2/std:np.float64(0.11691805467363334) - val-aux/aime2024/score/worst@2/mean:np.float64(0.05023333333333334) - val-aux/aime2024/score/worst@2/std:np.float64(0.08453444805557388) - val-aux/aime2024/score/maj@2/mean:np.float64(0.09963333333333332) - val-aux/aime2024/score/maj@2/std:np.float64(0.11801865320122487) - val-aux/aime2024/score/best@4/mean:np.float64(0.19556666666666667) - val-aux/aime2024/score/best@4/std:np.float64(0.09963703312889909) - val-aux/aime2024/score/worst@4/mean:np.float64(0.0176) - val-aux/aime2024/score/worst@4/std:np.float64(0.0463895371711608) - val-aux/aime2024/score/maj@4/mean:np.float64(0.12763333333333332) - val-aux/aime2024/score/maj@4/std:np.float64(0.10996525109325783) - val-aux/aime2024/score/best@8/mean:np.float64(0.23296666666666666) - val-aux/aime2024/score/best@8/std:np.float64(0.07220099768908175) - val-aux/aime2024/score/worst@8/mean:np.float64(0.0028) - val-aux/aime2024/score/worst@8/std:np.float64(0.016913985403416938) - 
val-aux/aime2024/score/maj@8/mean:np.float64(0.15096666666666667) - val-aux/aime2024/score/maj@8/std:np.float64(0.0906451137054987) - val-aux/aime2024/score/best@16/mean:np.float64(0.26313333333333333) - val-aux/aime2024/score/best@16/std:np.float64(0.05530171939039241) - val-aux/aime2024/score/worst@16/mean:np.float64(3.3333333333333335e-05) - val-aux/aime2024/score/worst@16/std:np.float64(0.001053565375285274) - val-aux/aime2024/score/maj@16/mean:np.float64(0.1711333333333333) - val-aux/aime2024/score/maj@16/std:np.float64(0.06627669773850645) - val-core/aime2024/acc/mean@16:np.float64(0.1) - val-aux/aime2024/acc/std@16:np.float64(0.11773339488556522) - val-aux/aime2024/acc/best@2/mean:np.float64(0.14886666666666667) - val-aux/aime2024/acc/best@2/std:np.float64(0.11691805467363334) - val-aux/aime2024/acc/worst@2/mean:np.float64(0.05023333333333334) - val-aux/aime2024/acc/worst@2/std:np.float64(0.08453444805557388) - val-aux/aime2024/acc/maj@2/mean:np.float64(0.09963333333333332) - val-aux/aime2024/acc/maj@2/std:np.float64(0.11801865320122487) - val-aux/aime2024/acc/best@4/mean:np.float64(0.19556666666666667) - val-aux/aime2024/acc/best@4/std:np.float64(0.09963703312889909) - val-aux/aime2024/acc/worst@4/mean:np.float64(0.0176) - val-aux/aime2024/acc/worst@4/std:np.float64(0.0463895371711608) - val-aux/aime2024/acc/maj@4/mean:np.float64(0.12763333333333332) - val-aux/aime2024/acc/maj@4/std:np.float64(0.10996525109325783) - val-aux/aime2024/acc/best@8/mean:np.float64(0.23296666666666666) - val-aux/aime2024/acc/best@8/std:np.float64(0.07220099768908175) - val-aux/aime2024/acc/worst@8/mean:np.float64(0.0028) - val-aux/aime2024/acc/worst@8/std:np.float64(0.016913985403416938) - val-aux/aime2024/acc/maj@8/mean:np.float64(0.15096666666666667) - val-aux/aime2024/acc/maj@8/std:np.float64(0.0906451137054987) - val-core/aime2024/acc/best@16/mean:np.float64(0.26313333333333333) - val-core/aime2024/acc/best@16/std:np.float64(0.05530171939039241) - 
val-aux/aime2024/acc/worst@16/mean:np.float64(3.3333333333333335e-05) - val-aux/aime2024/acc/worst@16/std:np.float64(0.001053565375285274) - val-core/aime2024/acc/maj@16/mean:np.float64(0.1711333333333333) - val-core/aime2024/acc/maj@16/std:np.float64(0.06627669773850645) - val-aux/aime2025/reward/mean@16:np.float64(0.05) - val-aux/aime2025/reward/std@16:np.float64(0.08598631667449093) - val-aux/aime2025/reward/best@2/mean:np.float64(0.08193333333333333) - val-aux/aime2025/reward/best@2/std:np.float64(0.09828577555768435) - val-aux/aime2025/reward/worst@2/mean:np.float64(0.015466666666666667) - val-aux/aime2025/reward/worst@2/std:np.float64(0.046786987321973963) - val-aux/aime2025/reward/maj@2/mean:np.float64(0.048) - val-aux/aime2025/reward/maj@2/std:np.float64(0.08588571564153905) - val-aux/aime2025/reward/best@4/mean:np.float64(0.12663333333333335) - val-aux/aime2025/reward/best@4/std:np.float64(0.09754631376474478) - val-aux/aime2025/reward/worst@4/mean:np.float64(0.0030000000000000005) - val-aux/aime2025/reward/worst@4/std:np.float64(0.01548514012475605) - val-aux/aime2025/reward/maj@4/mean:np.float64(0.062400000000000004) - val-aux/aime2025/reward/maj@4/std:np.float64(0.08719334972443911) - val-aux/aime2025/reward/best@8/mean:np.float64(0.16846666666666665) - val-aux/aime2025/reward/best@8/std:np.float64(0.08243110119932637) - val-aux/aime2025/reward/worst@8/mean:np.float64(0.00013333333333333334) - val-aux/aime2025/reward/worst@8/std:np.float64(0.002876566563801983) - val-aux/aime2025/reward/maj@8/mean:np.float64(0.07360000000000001) - val-aux/aime2025/reward/maj@8/std:np.float64(0.08337285232890626) - val-aux/aime2025/reward/best@16/mean:np.float64(0.20299999999999999) - val-aux/aime2025/reward/best@16/std:np.float64(0.05512753966791213) - val-aux/aime2025/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2025/reward/worst@16/std:np.float64(0.0) - val-aux/aime2025/reward/maj@16/mean:np.float64(0.08613333333333334) - 
val-aux/aime2025/reward/maj@16/std:np.float64(0.07578965483329474) - val-aux/aime2025/score/mean@16:np.float64(0.05) - val-aux/aime2025/score/std@16:np.float64(0.08598631667449093) - val-aux/aime2025/score/best@2/mean:np.float64(0.08193333333333333) - val-aux/aime2025/score/best@2/std:np.float64(0.09828577555768435) - val-aux/aime2025/score/worst@2/mean:np.float64(0.015466666666666667) - val-aux/aime2025/score/worst@2/std:np.float64(0.046786987321973963) - val-aux/aime2025/score/maj@2/mean:np.float64(0.048) - val-aux/aime2025/score/maj@2/std:np.float64(0.08588571564153905) - val-aux/aime2025/score/best@4/mean:np.float64(0.12663333333333335) - val-aux/aime2025/score/best@4/std:np.float64(0.09754631376474478) - val-aux/aime2025/score/worst@4/mean:np.float64(0.0030000000000000005) - val-aux/aime2025/score/worst@4/std:np.float64(0.01548514012475605) - val-aux/aime2025/score/maj@4/mean:np.float64(0.062400000000000004) - val-aux/aime2025/score/maj@4/std:np.float64(0.08719334972443911) - val-aux/aime2025/score/best@8/mean:np.float64(0.16846666666666665) - val-aux/aime2025/score/best@8/std:np.float64(0.08243110119932637) - val-aux/aime2025/score/worst@8/mean:np.float64(0.00013333333333333334) - val-aux/aime2025/score/worst@8/std:np.float64(0.002876566563801983) - val-aux/aime2025/score/maj@8/mean:np.float64(0.07360000000000001) - val-aux/aime2025/score/maj@8/std:np.float64(0.08337285232890626) - val-aux/aime2025/score/best@16/mean:np.float64(0.20299999999999999) - val-aux/aime2025/score/best@16/std:np.float64(0.05512753966791213) - val-aux/aime2025/score/worst@16/mean:np.float64(0.0) - val-aux/aime2025/score/worst@16/std:np.float64(0.0) - val-aux/aime2025/score/maj@16/mean:np.float64(0.08613333333333334) - val-aux/aime2025/score/maj@16/std:np.float64(0.07578965483329474) - val-core/aime2025/acc/mean@16:np.float64(0.05) - val-aux/aime2025/acc/std@16:np.float64(0.08598631667449093) - val-aux/aime2025/acc/best@2/mean:np.float64(0.08193333333333333) - 
val-aux/aime2025/acc/best@2/std:np.float64(0.09828577555768435) - val-aux/aime2025/acc/worst@2/mean:np.float64(0.015466666666666667) - val-aux/aime2025/acc/worst@2/std:np.float64(0.046786987321973963) - val-aux/aime2025/acc/maj@2/mean:np.float64(0.048) - val-aux/aime2025/acc/maj@2/std:np.float64(0.08588571564153905) - val-aux/aime2025/acc/best@4/mean:np.float64(0.12663333333333335) - val-aux/aime2025/acc/best@4/std:np.float64(0.09754631376474478) - val-aux/aime2025/acc/worst@4/mean:np.float64(0.0030000000000000005) - val-aux/aime2025/acc/worst@4/std:np.float64(0.01548514012475605) - val-aux/aime2025/acc/maj@4/mean:np.float64(0.062400000000000004) - val-aux/aime2025/acc/maj@4/std:np.float64(0.08719334972443911) - val-aux/aime2025/acc/best@8/mean:np.float64(0.16846666666666665) - val-aux/aime2025/acc/best@8/std:np.float64(0.08243110119932637) - val-aux/aime2025/acc/worst@8/mean:np.float64(0.00013333333333333334) - val-aux/aime2025/acc/worst@8/std:np.float64(0.002876566563801983) - val-aux/aime2025/acc/maj@8/mean:np.float64(0.07360000000000001) - val-aux/aime2025/acc/maj@8/std:np.float64(0.08337285232890626) - val-core/aime2025/acc/best@16/mean:np.float64(0.20299999999999999) - val-core/aime2025/acc/best@16/std:np.float64(0.05512753966791213) - val-aux/aime2025/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2025/acc/worst@16/std:np.float64(0.0) - val-core/aime2025/acc/maj@16/mean:np.float64(0.08613333333333334) - val-core/aime2025/acc/maj@16/std:np.float64(0.07578965483329474) - val-aux/math500/reward/mean@4:np.float64(0.6895) - val-aux/math500/reward/std@4:np.float64(0.12827241335952166) - val-aux/math500/reward/best@2/mean:np.float64(0.7478579999999999) - val-aux/math500/reward/best@2/std:np.float64(0.1062501295633603) - val-aux/math500/reward/worst@2/mean:np.float64(0.631336) - val-aux/math500/reward/worst@2/std:np.float64(0.11451286251721333) - val-aux/math500/reward/maj@2/mean:np.float64(0.689508) - 
val-aux/math500/reward/maj@2/std:np.float64(0.12847369632040245) - val-aux/math500/reward/best@4/mean:np.float64(0.79049) - val-aux/math500/reward/best@4/std:np.float64(0.06523401220039318) - val-aux/math500/reward/worst@4/mean:np.float64(0.58258) - val-aux/math500/reward/worst@4/std:np.float64(0.07949196662674339) - val-aux/math500/reward/maj@4/mean:np.float64(0.703916) - val-aux/math500/reward/maj@4/std:np.float64(0.11838304283857166) - val-aux/math500/score/mean@4:np.float64(0.6895) - val-aux/math500/score/std@4:np.float64(0.12827241335952166) - val-aux/math500/score/best@2/mean:np.float64(0.7478579999999999) - val-aux/math500/score/best@2/std:np.float64(0.1062501295633603) - val-aux/math500/score/worst@2/mean:np.float64(0.631336) - val-aux/math500/score/worst@2/std:np.float64(0.11451286251721333) - val-aux/math500/score/maj@2/mean:np.float64(0.689508) - val-aux/math500/score/maj@2/std:np.float64(0.12847369632040245) - val-aux/math500/score/best@4/mean:np.float64(0.79049) - val-aux/math500/score/best@4/std:np.float64(0.06523401220039318) - val-aux/math500/score/worst@4/mean:np.float64(0.58258) - val-aux/math500/score/worst@4/std:np.float64(0.07949196662674339) - val-aux/math500/score/maj@4/mean:np.float64(0.703916) - val-aux/math500/score/maj@4/std:np.float64(0.11838304283857166) - val-core/math500/acc/mean@4:np.float64(0.6895) - val-aux/math500/acc/std@4:np.float64(0.12827241335952166) - val-aux/math500/acc/best@2/mean:np.float64(0.7478579999999999) - val-aux/math500/acc/best@2/std:np.float64(0.1062501295633603) - val-aux/math500/acc/worst@2/mean:np.float64(0.631336) - val-aux/math500/acc/worst@2/std:np.float64(0.11451286251721333) - val-aux/math500/acc/maj@2/mean:np.float64(0.689508) - val-aux/math500/acc/maj@2/std:np.float64(0.12847369632040245) - val-core/math500/acc/best@4/mean:np.float64(0.79049) - val-core/math500/acc/best@4/std:np.float64(0.06523401220039318) - val-aux/math500/acc/worst@4/mean:np.float64(0.58258) - 
val-aux/math500/acc/worst@4/std:np.float64(0.07949196662674339) - val-core/math500/acc/maj@4/mean:np.float64(0.703916) - val-core/math500/acc/maj@4/std:np.float64(0.11838304283857166) - val-aux/num_turns/min:np.int32(2) - val-aux/num_turns/max:np.int32(2) - val-aux/num_turns/mean:np.float64(2.0) - val-aux/response_length/clip_ratio:0.06283783783783783 - val-aux/aime2024/response_length/clip_ratio:0.13541666666666666 - val-aux/aime2025/response_length/clip_ratio:0.10833333333333334 - val-aux/math500/response_length/clip_ratio:0.0345 - training/global_step:175 - training/epoch:0 - critic/score/mean:0.56640625 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5550939440727234 - critic/rewards/max:1.028234601020813 - critic/rewards/min:-0.06544473767280579 - critic/advantages/mean:-0.12876813113689423 - critic/advantages/max:2.4748375415802 - critic/advantages/min:-2.4748153686523438 - critic/returns/mean:-0.12876813113689423 - critic/returns/max:2.4748375415802 - critic/returns/min:-2.4748153686523438 - response_length/mean:1174.53515625 - response_length/max:8192.0 - response_length/min:188.0 - response_length/clip_ratio:0.013020833022892475 - response_length_non_aborted/mean:1174.53515625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:188.0 - response_length_non_aborted/clip_ratio:0.013020833022892475 - response/aborted_ratio:0.0 - prompt_length/mean:238.5520782470703 - prompt_length/max:555.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.268002420663834e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6416036961600184) - timing_s/agent_loop/generate_sequences/max:np.float64(29.6904979608953) - 
timing_s/agent_loop/generate_sequences/mean:np.float64(7.563814556961006) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.6904979608953) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:213 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.757556236349046 - timing_s/reward:0.00017256569117307663 - timing_s/old_log_prob:10.166140406392515 - timing_s/ref:20.994766537100077 - timing_s/adv:0.0747922109439969 - timing_s/update_actor:21.152197692543268 - timing_s/save_checkpoint:56.14761977363378 - timing_s/update_weights:29.82403076812625 - timing_s/step:170.51606599614024 - timing_s/testing:131.21018296293914 - timing_s/stop_profile:0.0004524877294898033 - timing_per_token_ms/adv:6.891697030824842e-05 - timing_per_token_ms/update_actor:0.019490604194369106 - timing_per_token_ms/gen:0.03520625539619402 - timing_per_token_ms/ref:0.019345539913900173 - perf/total_num_tokens:1636461 - perf/time_per_step:170.51606599614024 - perf/throughput:2399.2768517733725 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:621.0 - frontier/mean_score:2.5329294140321186 - frontier/mean_frontier_pct:0.28194374191456745 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:4.18319597 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.530215586999999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - 
frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.4319299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.7680699999999998 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:2.8319299999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.0569999999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.1880699999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:96.0 - frontier/cluster_10/score:4.4589371 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.09 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.3825509999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.7598999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.6569999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.2699899999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:2.2319299999999993 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:2.183 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - 
frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:3.2519299999999993 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:3.1410562999999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.8319299999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.3376456999999995 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.7437299999999993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.8823509999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:2.3629999999999995 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.9176456999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.238491 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:2.339899999999999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.7382909999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.0769999999999995 - frontier/cluster_35/cluster_size:184.0 - 
frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:1.91 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:2.1598999999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.9759899999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.8823509999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:3.1481519899999992 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:2.4820699999999993 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.8319299999999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:2.7598999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.8717625899999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.91 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:3.5176456999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:80.0 - frontier/cluster_50/score:1.411649 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.1598999999999995 - 
frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:3.0155080099999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.2319299999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.8319299999999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.8929394099999994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.9682350999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9903110989592996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.09 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.0569999999999995 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:175.0 - cluster/prob_snapshot/cluster_0:0.02707417851237623 - cluster/prob_snapshot/cluster_1:0.016375878387842963 - cluster/prob_snapshot/cluster_2:0.008026085560891799 - cluster/prob_snapshot/cluster_3:0.02221188921779807 - cluster/prob_snapshot/cluster_4:0.011002617090166969 - cluster/prob_snapshot/cluster_5:0.011443174822712654 - cluster/prob_snapshot/cluster_6:0.008692067501231906 - cluster/prob_snapshot/cluster_7:0.01832861259773914 - cluster/prob_snapshot/cluster_8:0.01331316667910203 - cluster/prob_snapshot/cluster_9:0.014161468456753904 - cluster/prob_snapshot/cluster_10:0.028858810317905614 - 
cluster/prob_snapshot/cluster_11:0.013526746893205273 - cluster/prob_snapshot/cluster_12:0.015420174323996703 - cluster/prob_snapshot/cluster_13:0.017862425239501066 - cluster/prob_snapshot/cluster_14:0.01719644329916096 - cluster/prob_snapshot/cluster_15:0.021163792858044164 - cluster/prob_snapshot/cluster_16:0.01444533597768021 - cluster/prob_snapshot/cluster_17:0.016245040527246523 - cluster/prob_snapshot/cluster_18:0.014128654769314406 - cluster/prob_snapshot/cluster_19:0.01719644329916096 - cluster/prob_snapshot/cluster_20:0.02104690623178039 - cluster/prob_snapshot/cluster_21:0.02032931748679801 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01832861259773914 - cluster/prob_snapshot/cluster_24:0.021601669188083702 - cluster/prob_snapshot/cluster_25:0.008692067501231906 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.017757770934590477 - cluster/prob_snapshot/cluster_28:0.018654943748505792 - cluster/prob_snapshot/cluster_29:0.015293637755332084 - cluster/prob_snapshot/cluster_30:0.01331316667910203 - cluster/prob_snapshot/cluster_31:0.018883375554042448 - cluster/prob_snapshot/cluster_32:0.014487799607520557 - cluster/prob_snapshot/cluster_33:0.015144131605459812 - cluster/prob_snapshot/cluster_34:0.017722569032029643 - cluster/prob_snapshot/cluster_35:0.019914736933202212 - cluster/prob_snapshot/cluster_36:0.012361763907187594 - cluster/prob_snapshot/cluster_37:0.013979148619442135 - cluster/prob_snapshot/cluster_38:0.01926098731421529 - cluster/prob_snapshot/cluster_39:0.018654943748505792 - cluster/prob_snapshot/cluster_40:0.01876917033028483 - cluster/prob_snapshot/cluster_41:0.02037524169859832 - cluster/prob_snapshot/cluster_42:0.016064274000582777 - cluster/prob_snapshot/cluster_43:0.01832861259773914 - cluster/prob_snapshot/cluster_44:0.017862425239501066 - cluster/prob_snapshot/cluster_45:0.018586414206844797 - cluster/prob_snapshot/cluster_46:0.012944255400196435 - 
cluster/prob_snapshot/cluster_47:0.017862425239501066 - cluster/prob_snapshot/cluster_48:0.012361763907187594 - cluster/prob_snapshot/cluster_49:0.02276665217410138 - cluster/prob_snapshot/cluster_50:0.009136372595715947 - cluster/prob_snapshot/cluster_51:0.012361763907187594 - cluster/prob_snapshot/cluster_52:0.013979148619442135 - cluster/prob_snapshot/cluster_53:0.01951675292138905 - cluster/prob_snapshot/cluster_54:0.01444533597768021 - cluster/prob_snapshot/cluster_55:0.01832861259773914 - cluster/prob_snapshot/cluster_56:0.01872347329016679 - cluster/prob_snapshot/cluster_57:0.017862425239501066 - cluster/prob_snapshot/cluster_58:0.012738668911015582 - cluster/prob_snapshot/cluster_59:0.019353675295485623 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.013526746893205273 - cluster/prob_snapshot/cluster_62:0.01331316667910203 - cluster/prob_snapshot/cluster_63:0.008692067501231906
[36m(TaskRunner pid=2823680)[0m Training Progress:  22%|██▏       | 176/800 [5:35:33<26:33:36, 153.23s/it]
[36m(TaskRunner pid=2823680)[0m step:176 - global_seqlen/min:321554 - global_seqlen/max:409806 - global_seqlen/minmax_diff:88252 - global_seqlen/balanced_min:361105 - global_seqlen/balanced_max:361161 - global_seqlen/mean:361127.25 - frontier/skipped_zero_acc_count:23.0 - actor/entropy:np.float64(0.21268244730835817) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012615857645869255 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0503021401600563) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003686766758619342) - actor/ppo_kl:np.float64(2.43148586451551e-05) - actor/pg_clipfrac_lower:np.float64(6.614121305591532e-06) - actor/grad_norm:np.float64(0.224867045879364) - perf/mfu/actor:np.float64(0.1412148349262086) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(156.26177215576172) - actor/lr:np.float64(1e-06) - training/global_step:176 - training/epoch:0 - critic/score/mean:0.5440475940704346 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5334063172340393 - critic/rewards/max:1.0019662380218506 - critic/rewards/min:-0.055611103773117065 - critic/advantages/mean:-0.12319725751876831 - critic/advantages/max:2.4748406410217285 - critic/advantages/min:-2.4748270511627197 - critic/returns/mean:-0.12319725751876831 - critic/returns/max:2.4748406410217285 - critic/returns/min:-2.4748270511627197 - response_length/mean:1114.5428466796875 - response_length/max:8192.0 - response_length/min:137.0 - response_length/clip_ratio:0.009523809887468815 - response_length_non_aborted/mean:1114.5428466796875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:137.0 - response_length_non_aborted/clip_ratio:0.009523809887468815 - response/aborted_ratio:0.0 - prompt_length/mean:238.58094787597656 - prompt_length/max:417.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.279923349618912e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1294072829186916) - timing_s/agent_loop/generate_sequences/max:np.float64(27.646789822727442) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.518688334224862) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.646789822727442) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:281 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.464989136904478 - timing_s/reward:0.00020973198115825653 - timing_s/old_log_prob:10.862761782482266 - timing_s/ref:11.45103311073035 - timing_s/adv:0.12644134182482958 - timing_s/update_actor:30.26840458344668 - timing_s/update_weights:24.184665698558092 - timing_s/step:107.80332235712558 - timing_s/stop_profile:7.528159767389297e-05 - timing_per_token_ms/adv:0.00011124289283424385 - timing_per_token_ms/update_actor:0.02663009454617066 - timing_per_token_ms/gen:0.03254055595813837 - timing_per_token_ms/ref:0.010074600844897126 - perf/total_num_tokens:1444509 - perf/time_per_step:107.80332235712558 - perf/throughput:3349.8712479720734 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:644.0 - frontier/mean_score:2.566588238686217 - frontier/mean_frontier_pct:0.30323094634003 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:4.18319597 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.671150910899999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.4319299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.7680699999999998 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.8823509999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.0569999999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.1880699999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:96.0 - frontier/cluster_10/score:4.4589371 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.09 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.5677856999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.7598999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.6569999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.2699899999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:1.8623509999999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:2.183 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:2.7598999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:3.2519299999999993 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:3.1410562999999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.8319299999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.8363519899999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:1.2401 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.7437299999999993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.8823509999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:2.3629999999999995 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.9423519899999997 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.238491 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - 
frontier/cluster_33/score:2.339899999999999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.8168036999999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.0769999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.237 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:2.4119299999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.9759899999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.53 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:3.1481519899999992 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:2.4820699999999993 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.8823509999999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:2.7598999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.8717625899999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.91 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:3.5176456999999997 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:96.0 - frontier/cluster_50/score:1.2881543 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.1598999999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:3.0155080099999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:2.2319299999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.8319299999999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.8929394099999994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.9682350999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9903110989592996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:2.3629999999999995 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.0569999999999995 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:176.0 - cluster/prob_snapshot/cluster_0:0.026719121548635018 - cluster/prob_snapshot/cluster_1:0.01706131062826689 - cluster/prob_snapshot/cluster_2:0.007920829640802672 - cluster/prob_snapshot/cluster_3:0.021920597426949368 - 
cluster/prob_snapshot/cluster_4:0.010858326255434677 - cluster/prob_snapshot/cluster_5:0.011293106413203757 - cluster/prob_snapshot/cluster_6:0.008578077741793394 - cluster/prob_snapshot/cluster_7:0.01841029855334023 - cluster/prob_snapshot/cluster_8:0.013138574769075956 - cluster/prob_snapshot/cluster_9:0.013975751723369971 - cluster/prob_snapshot/cluster_10:0.028480349284859856 - cluster/prob_snapshot/cluster_11:0.01334935404344616 - cluster/prob_snapshot/cluster_12:0.01640109110861159 - cluster/prob_snapshot/cluster_13:0.01762817331316127 - cluster/prob_snapshot/cluster_14:0.016970925212170546 - cluster/prob_snapshot/cluster_15:0.02088624604235814 - cluster/prob_snapshot/cluster_16:0.011895302800079423 - cluster/prob_snapshot/cluster_17:0.016031999353612374 - cluster/prob_snapshot/cluster_18:0.013943368362125822 - cluster/prob_snapshot/cluster_19:0.01762817331316127 - cluster/prob_snapshot/cluster_20:0.020770892294020987 - cluster/prob_snapshot/cluster_21:0.02006271417181676 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.018088246983854774 - cluster/prob_snapshot/cluster_24:0.02450374208123886 - cluster/prob_snapshot/cluster_25:0.007920829640802672 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.01752489146871987 - cluster/prob_snapshot/cluster_28:0.01841029855334023 - cluster/prob_snapshot/cluster_29:0.015093073495054198 - cluster/prob_snapshot/cluster_30:0.013138574769075956 - cluster/prob_snapshot/cluster_31:0.018793539921027922 - cluster/prob_snapshot/cluster_32:0.014297803292855425 - cluster/prob_snapshot/cluster_33:0.014945528002995053 - cluster/prob_snapshot/cluster_34:0.01799163151300914 - cluster/prob_snapshot/cluster_35:0.019653570522336763 - cluster/prob_snapshot/cluster_36:0.014288279902004337 - cluster/prob_snapshot/cluster_37:0.015405601673688561 - cluster/prob_snapshot/cluster_38:0.019008394325241786 - cluster/prob_snapshot/cluster_39:0.018635734651980047 - 
cluster/prob_snapshot/cluster_40:0.02254699510687318 - cluster/prob_snapshot/cluster_41:0.020108036123009362 - cluster/prob_snapshot/cluster_42:0.01585360344048632 - cluster/prob_snapshot/cluster_43:0.01841029855334023 - cluster/prob_snapshot/cluster_44:0.01762817331316127 - cluster/prob_snapshot/cluster_45:0.018342667723748284 - cluster/prob_snapshot/cluster_46:0.012774501476981972 - cluster/prob_snapshot/cluster_47:0.01762817331316127 - cluster/prob_snapshot/cluster_48:0.012199648910517783 - cluster/prob_snapshot/cluster_49:0.02246808509507464 - cluster/prob_snapshot/cluster_50:0.008227764503965338 - cluster/prob_snapshot/cluster_51:0.012199648910517783 - cluster/prob_snapshot/cluster_52:0.013795822870066678 - cluster/prob_snapshot/cluster_53:0.019260805763797982 - cluster/prob_snapshot/cluster_54:0.014255896540760182 - cluster/prob_snapshot/cluster_55:0.018088246983854774 - cluster/prob_snapshot/cluster_56:0.018477929382932173 - cluster/prob_snapshot/cluster_57:0.01762817331316127 - cluster/prob_snapshot/cluster_58:0.012571611095998877 - cluster/prob_snapshot/cluster_59:0.01909986677514558 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.015093073495054198 - cluster/prob_snapshot/cluster_62:0.013138574769075956 - cluster/prob_snapshot/cluster_63:0.008578077741793394
[36m(TaskRunner pid=2823680)[0m Training Progress:  22%|██▏       | 177/800 [5:37:23<24:17:33, 140.37s/it]
[36m(TaskRunner pid=2823680)[0m step:177 - global_seqlen/min:323202 - global_seqlen/max:450391 - global_seqlen/minmax_diff:127189 - global_seqlen/balanced_min:387910 - global_seqlen/balanced_max:388118 - global_seqlen/mean:388004.5 - frontier/skipped_zero_acc_count:24.0 - actor/entropy:np.float64(0.18207182849829012) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012546216137707233 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.03617384885365027) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00036462947921861016) - actor/ppo_kl:np.float64(2.6578383306321903e-05) - actor/pg_clipfrac_lower:np.float64(1.0325495797867636e-06) - actor/grad_norm:np.float64(0.2160936387685629) - perf/mfu/actor:np.float64(0.18716809644140622) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(164.2591438293457) - actor/lr:np.float64(1e-06) - training/global_step:177 - training/epoch:0 - critic/score/mean:0.614182710647583 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6030375361442566 - critic/rewards/max:1.0131760835647583 - critic/rewards/min:-0.08061549812555313 - critic/advantages/mean:-0.14114296436309814 - critic/advantages/max:2.474731683731079 - critic/advantages/min:-2.4748497009277344 - critic/returns/mean:-0.14114296436309814 - critic/returns/max:2.474731683731079 - critic/returns/min:-2.4748497009277344 - response_length/mean:1206.06005859375 - response_length/max:8192.0 - response_length/min:70.0 - response_length/clip_ratio:0.014423076994717121 - response_length_non_aborted/mean:1206.06005859375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:70.0 - response_length_non_aborted/clip_ratio:0.014423076994717121 - response/aborted_ratio:0.0 - prompt_length/mean:241.39422607421875 - prompt_length/max:478.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.648447692394257e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.6234140861779451) - timing_s/agent_loop/generate_sequences/max:np.float64(28.221538464538753) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.9702882361179945) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.221538464538753) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:254 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.015012717805803 - timing_s/reward:0.00024126004427671432 - timing_s/old_log_prob:10.740054400637746 - timing_s/ref:16.129102243110538 - timing_s/adv:0.08904958982020617 - timing_s/update_actor:24.719171326607466 - timing_s/update_weights:28.041710537858307 - timing_s/step:110.14841446839273 - timing_s/stop_profile:5.568843334913254e-05 - timing_per_token_ms/adv:7.394413419797536e-05 - timing_per_token_ms/update_actor:0.02052606559477553 - timing_per_token_ms/gen:0.0299120554230397 - timing_per_token_ms/ref:0.013393127393011385 - perf/total_num_tokens:1552018 - perf/time_per_step:110.14841446839273 - perf/throughput:3522.5609181268655 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:668.0 - frontier/mean_score:2.6042514121288396 - frontier/mean_frontier_pct:0.324184913486332 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:4.18319597 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.671150910899999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.4319299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.7680699999999998 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.8823509999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.0569999999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.4316489999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:96.0 - frontier/cluster_10/score:4.4589371 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.3629999999999995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.5677856999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.7598999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:2.7598999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.2699899999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:1.8623509999999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.4319299999999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:3.1763509999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:3.1410562999999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.8319299999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:4.185446392999999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:1.2401 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.7437299999999993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.8823509999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:2.3629999999999995 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.9423519899999997 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.238491 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.5379299999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.8168036999999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:3.0538999999999996 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.237 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:2.4119299999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.9759899999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.53 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:3.1481519899999992 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:2.6374489999999993 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.8823509999999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:2.7598999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.8717625899999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.8319299999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.91 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:3.5176456999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:96.0 - frontier/cluster_50/score:1.2881543 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.1598999999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:3.0155080099999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:1.8623509999999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:3.4823509999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.9250575869999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.8319299999999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.9682350999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9903110989592996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:2.3629999999999995 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.0569999999999995 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:177.0 - cluster/prob_snapshot/cluster_0:0.026332704590412797 - cluster/prob_snapshot/cluster_1:0.01681456674695107 - 
cluster/prob_snapshot/cluster_2:0.007806277113661234 - cluster/prob_snapshot/cluster_3:0.02160357762655221 - cluster/prob_snapshot/cluster_4:0.010701291100092008 - cluster/prob_snapshot/cluster_5:0.011129783385493926 - cluster/prob_snapshot/cluster_6:0.008454019969072685 - cluster/prob_snapshot/cluster_7:0.018144045355083113 - cluster/prob_snapshot/cluster_8:0.012948562231111327 - cluster/prob_snapshot/cluster_9:0.015306931648380958 - cluster/prob_snapshot/cluster_10:0.02806846112005886 - cluster/prob_snapshot/cluster_11:0.014874794629127887 - cluster/prob_snapshot/cluster_12:0.016163895446090308 - cluster/prob_snapshot/cluster_13:0.017373231357143488 - cluster/prob_snapshot/cluster_14:0.017373231357143488 - cluster/prob_snapshot/cluster_15:0.020584185226111683 - cluster/prob_snapshot/cluster_16:0.01172327069502791 - cluster/prob_snapshot/cluster_17:0.015800141565429963 - cluster/prob_snapshot/cluster_18:0.011507664858869528 - cluster/prob_snapshot/cluster_19:0.02160357762655221 - cluster/prob_snapshot/cluster_20:0.019994739227687262 - cluster/prob_snapshot/cluster_21:0.019772563428281133 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.017826651355931502 - cluster/prob_snapshot/cluster_24:0.02634687072666064 - cluster/prob_snapshot/cluster_25:0.007806277113661234 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.017271443194150256 - cluster/prob_snapshot/cluster_28:0.018144045355083113 - cluster/prob_snapshot/cluster_29:0.014874794629127887 - cluster/prob_snapshot/cluster_30:0.012948562231111327 - cluster/prob_snapshot/cluster_31:0.018521744214073533 - cluster/prob_snapshot/cluster_32:0.014091025774080033 - cluster/prob_snapshot/cluster_33:0.015975957483327353 - cluster/prob_snapshot/cluster_34:0.01773143315618602 - cluster/prob_snapshot/cluster_35:0.019223925229747633 - cluster/prob_snapshot/cluster_36:0.014081640112297542 - cluster/prob_snapshot/cluster_37:0.015182802966497006 - 
cluster/prob_snapshot/cluster_38:0.018733491353507534 - cluster/prob_snapshot/cluster_39:0.018366221154489243 - cluster/prob_snapshot/cluster_40:0.02222091622548517 - cluster/prob_snapshot/cluster_41:0.019817229924896432 - cluster/prob_snapshot/cluster_42:0.01660241735920386 - cluster/prob_snapshot/cluster_43:0.018144045355083113 - cluster/prob_snapshot/cluster_44:0.017373231357143488 - cluster/prob_snapshot/cluster_45:0.018077392615261274 - cluster/prob_snapshot/cluster_46:0.014478217370712716 - cluster/prob_snapshot/cluster_47:0.017826651355931502 - cluster/prob_snapshot/cluster_48:0.012023215294809255 - cluster/prob_snapshot/cluster_49:0.022143147425109953 - cluster/prob_snapshot/cluster_50:0.008108773027138381 - cluster/prob_snapshot/cluster_51:0.012023215294809255 - cluster/prob_snapshot/cluster_52:0.013596305086522778 - cluster/prob_snapshot/cluster_53:0.01898225237039362 - cluster/prob_snapshot/cluster_54:0.01172327069502791 - cluster/prob_snapshot/cluster_55:0.021920971625703823 - cluster/prob_snapshot/cluster_56:0.018412878072364528 - cluster/prob_snapshot/cluster_57:0.017826651355931502 - cluster/prob_snapshot/cluster_58:0.012389798093246293 - cluster/prob_snapshot/cluster_59:0.018823640911646768 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.014874794629127887 - cluster/prob_snapshot/cluster_62:0.012948562231111327 - cluster/prob_snapshot/cluster_63:0.008454019969072685
[36m(TaskRunner pid=2823680)[0m Training Progress:  22%|██▏       | 178/800 [5:39:13<22:39:22, 131.13s/it]
[36m(TaskRunner pid=2823680)[0m step:178 - global_seqlen/min:317103 - global_seqlen/max:375208 - global_seqlen/minmax_diff:58105 - global_seqlen/balanced_min:346706 - global_seqlen/balanced_max:346768 - global_seqlen/mean:346747.75 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.2221190086669392) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013730027712881565 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0013442097697407007) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004913282782783629) - actor/ppo_kl:np.float64(9.000524069304245e-05) - actor/pg_clipfrac_lower:np.float64(6.27288589182879e-06) - actor/grad_norm:np.float64(0.24179031948248544) - perf/mfu/actor:np.float64(0.204810326301208) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(161.2502498626709) - actor/lr:np.float64(1e-06) - training/global_step:178 - training/epoch:0 - critic/score/mean:0.6097221970558167 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5991016626358032 - critic/rewards/max:1.0148355960845947 - critic/rewards/min:-0.06257233768701553 - critic/advantages/mean:-0.08776940405368805 - critic/advantages/max:2.474778413772583 - critic/advantages/min:-2.474836826324463 - critic/returns/mean:-0.08776940405368805 - critic/returns/max:2.474778413772583 - critic/returns/min:-2.474836826324463 - response_length/mean:1050.88330078125 - response_length/max:8192.0 - response_length/min:158.0 - response_length/clip_ratio:0.0055555556900799274 - response_length_non_aborted/mean:1050.88330078125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:158.0 - response_length_non_aborted/clip_ratio:0.0055555556900799274 - response/aborted_ratio:0.0 - prompt_length/mean:239.10000610351562 - prompt_length/max:461.0 - prompt_length/min:182.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.102560579776764e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3774858731776476) - timing_s/agent_loop/generate_sequences/max:np.float64(27.51485183928162) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.113819396687177) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.51485183928162) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:211 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.150676446035504 - timing_s/reward:0.0001536952331662178 - timing_s/old_log_prob:9.6542610200122 - timing_s/ref:21.596344001591206 - timing_s/adv:0.09804532770067453 - timing_s/update_actor:20.165834761224687 - timing_s/update_weights:28.144925815053284 - timing_s/step:109.29569942038506 - timing_s/stop_profile:8.14143568277359e-05 - timing_per_token_ms/adv:0.00010556265552599143 - timing_per_token_ms/update_actor:0.021711988915904045 - timing_per_token_ms/gen:0.038526684490343445 - timing_per_token_ms/ref:0.02325217810909616 - perf/total_num_tokens:1386991 - perf/time_per_step:109.29569942038506 - perf/throughput:3172.5653602005045 - frontier/active_count:60.0 - frontier/completed_count:4.0 - frontier/blacklisted_count:706.0 - frontier/mean_score:2.626588614197654 - frontier/mean_frontier_pct:0.3341205465536495 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:14.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.828237179 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.671150910899999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.4319299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:2.1376489999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.343 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.8823509999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.0569999999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.4316489999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:96.0 - frontier/cluster_10/score:4.4589371 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.3629999999999995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.5677856999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.7598999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:2.7598999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.2699899999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:1.8623509999999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.3023509999999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:3.1234456999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:3.1410562999999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.8319299999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:4.185446392999999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.8206109999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.9176456999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:2.3629999999999995 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.9423519899999997 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.4669437 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.5379299999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.8168036999999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:3.0377299999999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.237 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.5883509999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.9759899999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.3709999999999996 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.103706392999999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:2.746214299999999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.9176456999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:2.7598999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.8717625899999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.8319299999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:1.91 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:3.5176456999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:96.0 - frontier/cluster_50/score:1.2881543 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:1.8119299999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:3.0155080099999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:1.8623509999999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:3.4823509999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.9250575869999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.8823509999999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.9682350999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:2.9903110989592996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:2.3629999999999995 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.0569999999999995 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:178.0 - cluster/prob_snapshot/cluster_0:0.02429156687821232 - cluster/prob_snapshot/cluster_1:0.016949430758852437 - 
cluster/prob_snapshot/cluster_2:0.007868888649563763 - cluster/prob_snapshot/cluster_3:0.021776852691796918 - cluster/prob_snapshot/cluster_4:0.010787122574194336 - cluster/prob_snapshot/cluster_5:0.013564165755061143 - cluster/prob_snapshot/cluster_6:0.008521826833613525 - cluster/prob_snapshot/cluster_7:0.018289572669912713 - cluster/prob_snapshot/cluster_8:0.013052418314775143 - cluster/prob_snapshot/cluster_9:0.015429703423774753 - cluster/prob_snapshot/cluster_10:0.028293588851954488 - cluster/prob_snapshot/cluster_11:0.014994100378130124 - cluster/prob_snapshot/cluster_12:0.01629354064127259 - cluster/prob_snapshot/cluster_13:0.017512576230893497 - cluster/prob_snapshot/cluster_14:0.017512576230893497 - cluster/prob_snapshot/cluster_15:0.020749284086111607 - cluster/prob_snapshot/cluster_16:0.011817299125396112 - cluster/prob_snapshot/cluster_17:0.01592686921248693 - cluster/prob_snapshot/cluster_18:0.011599963986990978 - cluster/prob_snapshot/cluster_19:0.020954626482360725 - cluster/prob_snapshot/cluster_20:0.019819406835141308 - cluster/prob_snapshot/cluster_21:0.019931152541497254 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01796963295972833 - cluster/prob_snapshot/cluster_24:0.026558190158241502 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.017897809759482856 - cluster/prob_snapshot/cluster_28:0.018513530467041784 - cluster/prob_snapshot/cluster_29:0.014994100378130124 - cluster/prob_snapshot/cluster_30:0.013052418314775143 - cluster/prob_snapshot/cluster_31:0.01867030092503213 - cluster/prob_snapshot/cluster_32:0.015653661220903822 - cluster/prob_snapshot/cluster_33:0.01610409529101472 - cluster/prob_snapshot/cluster_34:0.017873651046673013 - cluster/prob_snapshot/cluster_35:0.019275509327827855 - cluster/prob_snapshot/cluster_36:0.014194584234395725 - cluster/prob_snapshot/cluster_37:0.016424035001199103 - 
cluster/prob_snapshot/cluster_38:0.018883746417397998 - cluster/prob_snapshot/cluster_39:0.018513530467041784 - cluster/prob_snapshot/cluster_40:0.021390229528005355 - cluster/prob_snapshot/cluster_41:0.01969415370329504 - cluster/prob_snapshot/cluster_42:0.017425735452414874 - cluster/prob_snapshot/cluster_43:0.018513530467041784 - cluster/prob_snapshot/cluster_44:0.017512576230893497 - cluster/prob_snapshot/cluster_45:0.018222385330773992 - cluster/prob_snapshot/cluster_46:0.014594342306262924 - cluster/prob_snapshot/cluster_47:0.01796963295972833 - cluster/prob_snapshot/cluster_48:0.012119649480418342 - cluster/prob_snapshot/cluster_49:0.02232075019911037 - cluster/prob_snapshot/cluster_50:0.008173810781515002 - cluster/prob_snapshot/cluster_51:0.012119649480418342 - cluster/prob_snapshot/cluster_52:0.011497359415211728 - cluster/prob_snapshot/cluster_53:0.01913450266313814 - cluster/prob_snapshot/cluster_54:0.011817299125396112 - cluster/prob_snapshot/cluster_55:0.022096792401981303 - cluster/prob_snapshot/cluster_56:0.018560561604438888 - cluster/prob_snapshot/cluster_57:0.018289572669912713 - cluster/prob_snapshot/cluster_58:0.012489172516783319 - cluster/prob_snapshot/cluster_59:0.018974619034969255 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.014994100378130124 - cluster/prob_snapshot/cluster_62:0.013052418314775143 - cluster/prob_snapshot/cluster_63:0.008521826833613525
[36m(TaskRunner pid=2823680)[0m Training Progress:  22%|██▏       | 179/800 [5:41:14<22:06:13, 128.14s/it]
[36m(TaskRunner pid=2823680)[0m step:179 - global_seqlen/min:300082 - global_seqlen/max:434083 - global_seqlen/minmax_diff:134001 - global_seqlen/balanced_min:380696 - global_seqlen/balanced_max:380816 - global_seqlen/mean:380736.75 - frontier/skipped_zero_acc_count:27.0 - actor/entropy:np.float64(0.2164663218107878) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.014197293668985367 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.07985147123690695) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004588911905040161) - actor/ppo_kl:np.float64(1.9203002789610908e-05) - actor/pg_clipfrac_lower:np.float64(6.364529790338494e-07) - actor/grad_norm:np.float64(0.2661725523380133) - perf/mfu/actor:np.float64(0.20261797225416067) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(164.31396865844727) - actor/lr:np.float64(1e-06) - training/global_step:179 - training/epoch:0 - critic/score/mean:0.5631188154220581 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5515424609184265 - critic/rewards/max:1.0326640605926514 - critic/rewards/min:-0.05623422935605049 - critic/advantages/mean:-0.15305925905704498 - critic/advantages/max:2.4748497009277344 - critic/advantages/min:-2.4748435020446777 - critic/returns/mean:-0.15305925905704498 - critic/returns/max:2.4748497009277344 - critic/returns/min:-2.4748435020446777 - response_length/mean:1135.806884765625 - response_length/max:8192.0 - response_length/min:168.0 - response_length/clip_ratio:0.008663366548717022 - response_length_non_aborted/mean:1135.806884765625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:168.0 - response_length_non_aborted/clip_ratio:0.008663366548717022 - response/aborted_ratio:0.0 - prompt_length/mean:231.3069305419922 - prompt_length/max:414.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00011131074279546738 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.538199307397008) - timing_s/agent_loop/generate_sequences/max:np.float64(28.81728009879589) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.082914899598109) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.81728009879589) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:244 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.406179709360003 - timing_s/reward:0.00015053711831569672 - timing_s/old_log_prob:11.90291914716363 - timing_s/ref:23.04186170361936 - timing_s/adv:0.09645277727395296 - timing_s/update_actor:23.203244055621326 - timing_s/update_weights:31.81082790810615 - timing_s/step:120.90865866001695 - timing_s/stop_profile:5.426537245512009e-05 - timing_per_token_ms/adv:8.731697664186764e-05 - timing_per_token_ms/update_actor:0.021005482438994236 - timing_per_token_ms/gen:0.033131872604812734 - timing_per_token_ms/ref:0.020859385877978254 - perf/total_num_tokens:1522947 - perf/time_per_step:120.90865866001695 - perf/throughput:3148.961821424168 - frontier/active_count:59.0 - frontier/completed_count:5.0 - frontier/blacklisted_count:733.0 - frontier/mean_score:2.637103302126635 - frontier/mean_frontier_pct:0.3439436364080097 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:4.179766025299999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.671150910899999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.4319299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:2.1376489999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.9176456999999996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.339899999999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.4316489999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:112.0 - frontier/cluster_10/score:4.62125597 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:1.9540999999999997 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.5677856999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.7598999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:2.8319299999999994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.1889929999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:1.8623509999999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.3023509999999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:3.1234456999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:3.1410562999999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.2823509999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:4.185446392999999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.8206109999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.9176456999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.9423519899999997 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.4669437 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.5379299999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.8168036999999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:3.0377299999999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.237 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.5883509999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.9759899999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:48.0 - frontier/cluster_40/score:3.8596999999999997 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.103706392999999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:2.746214299999999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9423519899999997 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:2.7598999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.8717625899999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.8319299999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:2.237 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - 
frontier/cluster_49/score:3.5176456999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:96.0 - frontier/cluster_50/score:1.2881543 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.91 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:1.5683509999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:3.0108556069999994 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:1.8623509999999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:3.4823509999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.9250575869999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.8823509999999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.9682350999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:2.9932177692715096 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:2.3629999999999995 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.0569999999999995 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:179.0 - cluster/prob_snapshot/cluster_0:0.0268641322837475 - cluster/prob_snapshot/cluster_1:0.017167982845432077 - cluster/prob_snapshot/cluster_2:0.007970352944022548 - 
cluster/prob_snapshot/cluster_3:0.022057651301652527 - cluster/prob_snapshot/cluster_4:0.01092621563167352 - cluster/prob_snapshot/cluster_5:0.013739067011077214 - cluster/prob_snapshot/cluster_6:0.007970352944022548 - cluster/prob_snapshot/cluster_7:0.018752250620602956 - cluster/prob_snapshot/cluster_8:0.01503897173914874 - cluster/prob_snapshot/cluster_9:0.015628659596790166 - cluster/prob_snapshot/cluster_10:0.02970167012786975 - cluster/prob_snapshot/cluster_11:0.012559363509325424 - cluster/prob_snapshot/cluster_12:0.016503635443604545 - cluster/prob_snapshot/cluster_13:0.017738389718738673 - cluster/prob_snapshot/cluster_14:0.018201339902238343 - cluster/prob_snapshot/cluster_15:0.02049625009758672 - cluster/prob_snapshot/cluster_16:0.011969675651684003 - cluster/prob_snapshot/cluster_17:0.016132236020882665 - cluster/prob_snapshot/cluster_18:0.011749538115448446 - cluster/prob_snapshot/cluster_19:0.021224823010278043 - cluster/prob_snapshot/cluster_20:0.02007496543060202 - cluster/prob_snapshot/cluster_21:0.020188152026486224 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01466909363127393 - cluster/prob_snapshot/cluster_24:0.02690064106160479 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.018128590587688394 - cluster/prob_snapshot/cluster_28:0.018752250620602956 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013220720914324956 - cluster/prob_snapshot/cluster_31:0.018911042533543344 - cluster/prob_snapshot/cluster_32:0.015855505186705005 - cluster/prob_snapshot/cluster_33:0.016311747316525394 - cluster/prob_snapshot/cluster_34:0.018104120363703412 - cluster/prob_snapshot/cluster_35:0.019524054712237407 - cluster/prob_snapshot/cluster_36:0.014377614334149214 - cluster/prob_snapshot/cluster_37:0.016635812444975164 - cluster/prob_snapshot/cluster_38:0.019127240269237688 - cluster/prob_snapshot/cluster_39:0.018752250620602956 - 
cluster/prob_snapshot/cluster_40:0.024807008513864873 - cluster/prob_snapshot/cluster_41:0.019948097239600955 - cluster/prob_snapshot/cluster_42:0.01765042918387373 - cluster/prob_snapshot/cluster_43:0.018911042533543344 - cluster/prob_snapshot/cluster_44:0.017738389718738673 - cluster/prob_snapshot/cluster_45:0.018457351353713663 - cluster/prob_snapshot/cluster_46:0.014782527031087702 - cluster/prob_snapshot/cluster_47:0.018201339902238343 - cluster/prob_snapshot/cluster_48:0.014377614334149214 - cluster/prob_snapshot/cluster_49:0.02260856202001714 - cluster/prob_snapshot/cluster_50:0.00827920685215733 - cluster/prob_snapshot/cluster_51:0.012275924621468483 - cluster/prob_snapshot/cluster_52:0.010080083065971053 - cluster/prob_snapshot/cluster_53:0.01935132799877368 - cluster/prob_snapshot/cluster_54:0.011969675651684003 - cluster/prob_snapshot/cluster_55:0.0223817164301023 - cluster/prob_snapshot/cluster_56:0.01879988819448507 - cluster/prob_snapshot/cluster_57:0.018525405030688114 - cluster/prob_snapshot/cluster_58:0.012650212421428523 - cluster/prob_snapshot/cluster_59:0.01923796634095136 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.01518743972802619 - cluster/prob_snapshot/cluster_62:0.013220720914324956 - cluster/prob_snapshot/cluster_63:0.008631710349022081
[36m(TaskRunner pid=2823680)[0m step:180 - global_seqlen/min:365954 - global_seqlen/max:418117 - global_seqlen/minmax_diff:52163 - global_seqlen/balanced_min:386218 - global_seqlen/balanced_max:386434 - global_seqlen/mean:386321.0 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.2027529946062714) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011728160083293915 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.020435499187442474) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00034643960983279004) - actor/ppo_kl:np.float64(6.0687906530354496e-05) - actor/pg_clipfrac_lower:np.float64(1.1799946832979913e-06) - actor/grad_norm:np.float64(0.2753603234887123) - perf/mfu/actor:np.float64(0.20653358930289062) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(165.00806427001953) - actor/lr:np.float64(1e-06) - training/global_step:180 - training/epoch:0 - critic/score/mean:0.6302631497383118 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6193496584892273 - critic/rewards/max:1.032464623451233 - critic/rewards/min:-0.08042769134044647 - critic/advantages/mean:-0.10226745158433914 - critic/advantages/max:2.4748451709747314 - critic/advantages/min:-2.4748029708862305 - critic/returns/mean:-0.10226745158433914 - critic/returns/max:2.4748451709747314 - critic/returns/min:-2.4748029708862305 - response_length/mean:1192.1026611328125 - response_length/max:8192.0 - response_length/min:207.0 - response_length/clip_ratio:0.01184210553765297 - response_length_non_aborted/mean:1192.1026611328125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:207.0 - response_length_non_aborted/clip_ratio:0.01184210553765297 - response/aborted_ratio:0.0 - prompt_length/mean:245.87368774414062 - prompt_length/max:417.0 - prompt_length/min:185.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010944437235593796 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6405758913606405) - timing_s/agent_loop/generate_sequences/max:np.float64(28.508512766100466) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.086333971667045) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.508512766100466) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:291 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.99405154120177 - timing_s/reward:0.00017127953469753265 - timing_s/old_log_prob:10.962117405608296 - timing_s/ref:23.022348253056407 - timing_s/adv:0.13493635971099138 - timing_s/update_actor:22.103187668137252 - timing_s/update_weights:30.819441261701286 - timing_s/step:118.46808672882617 - timing_s/stop_profile:5.483906716108322e-05 - timing_per_token_ms/adv:0.00012347063006215917 - timing_per_token_ms/update_actor:0.02022504915363262 - timing_per_token_ms/gen:0.034209845431448824 - timing_per_token_ms/ref:0.021066107388724657 - perf/total_num_tokens:1545284 - perf/time_per_step:118.46808672882617 - perf/throughput:3260.97104011049 - frontier/active_count:59.0 - frontier/completed_count:5.0 - frontier/blacklisted_count:766.0 - frontier/mean_score:2.664011661955449 - frontier/mean_frontier_pct:0.3562578623930798 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:4.179766025299999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.671150910899999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.4319299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:2.1376489999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.9176456999999996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.339899999999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.6021542999999996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:112.0 - frontier/cluster_10/score:4.62125597 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.2678699999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.5677856999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.7598999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:2.8823509999999994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.1889929999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:1.8623509999999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.3023509999999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:3.1234456999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:3.1410562999999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.4976456999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:4.185446392999999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.874427699999999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.9176456999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.9596463929999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.4669437 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.5379299999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.8717625899999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:3.0377299999999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.237 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.5883509999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.9759899999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:48.0 - frontier/cluster_40/score:3.8596999999999997 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.103706392999999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:2.822350009999999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9596463929999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:3.4319299999999995 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.8717625899999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.8319299999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:1.8659000000000001 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - 
frontier/cluster_49/score:3.5176456999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:96.0 - frontier/cluster_50/score:1.2881543 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:2.237 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.3978456999999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:3.0108556069999994 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:1.8623509999999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:3.3376456999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.9475403108999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.8823509999999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.9682350999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:2.9932177692715096 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:2.3629999999999995 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.339899999999999 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:180.0 - cluster/prob_snapshot/cluster_0:0.026592785972354354 - cluster/prob_snapshot/cluster_1:0.01699457434781285 - cluster/prob_snapshot/cluster_2:0.007889846868150877 - 
cluster/prob_snapshot/cluster_3:0.021834853771641828 - cluster/prob_snapshot/cluster_4:0.01081585329881178 - cluster/prob_snapshot/cluster_5:0.01360029293432453 - cluster/prob_snapshot/cluster_6:0.007889846868150877 - cluster/prob_snapshot/cluster_7:0.018562839923005295 - cluster/prob_snapshot/cluster_8:0.014887067725817457 - cluster/prob_snapshot/cluster_9:0.016555599511571914 - cluster/prob_snapshot/cluster_10:0.029401662722222433 - cluster/prob_snapshot/cluster_11:0.014428793659280157 - cluster/prob_snapshot/cluster_12:0.01633693731410983 - cluster/prob_snapshot/cluster_13:0.017559219717288604 - cluster/prob_snapshot/cluster_14:0.018338285630402018 - cluster/prob_snapshot/cluster_15:0.0202892237993751 - cluster/prob_snapshot/cluster_16:0.011848773651114948 - cluster/prob_snapshot/cluster_17:0.015969289282363273 - cluster/prob_snapshot/cluster_18:0.011630859656210479 - cluster/prob_snapshot/cluster_19:0.02101043762187316 - cluster/prob_snapshot/cluster_20:0.019872194398826155 - cluster/prob_snapshot/cluster_21:0.01998423773182854 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01589068793153415 - cluster/prob_snapshot/cluster_24:0.026628925986311124 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.018287875483082913 - cluster/prob_snapshot/cluster_28:0.018562839923005295 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01308718249156225 - cluster/prob_snapshot/cluster_31:0.018830059531203196 - cluster/prob_snapshot/cluster_32:0.01569535362095761 - cluster/prob_snapshot/cluster_33:0.016146987389796103 - cluster/prob_snapshot/cluster_34:0.018270919342621032 - cluster/prob_snapshot/cluster_35:0.019326848259646767 - cluster/prob_snapshot/cluster_36:0.014232390487907032 - cluster/prob_snapshot/cluster_37:0.016467779236372215 - cluster/prob_snapshot/cluster_38:0.01893404191690051 - cluster/prob_snapshot/cluster_39:0.018562839923005295 - 
cluster/prob_snapshot/cluster_40:0.02455644057495519 - cluster/prob_snapshot/cluster_41:0.019746607664277795 - cluster/prob_snapshot/cluster_42:0.017956543332976443 - cluster/prob_snapshot/cluster_43:0.018830059531203196 - cluster/prob_snapshot/cluster_44:0.021834853771641828 - cluster/prob_snapshot/cluster_45:0.018270919342621032 - cluster/prob_snapshot/cluster_46:0.014633213286627702 - cluster/prob_snapshot/cluster_47:0.018017493783825906 - cluster/prob_snapshot/cluster_48:0.011871353335442883 - cluster/prob_snapshot/cluster_49:0.022380199910821217 - cluster/prob_snapshot/cluster_50:0.008195581138255047 - cluster/prob_snapshot/cluster_51:0.014232390487907032 - cluster/prob_snapshot/cluster_52:0.008893467073867565 - cluster/prob_snapshot/cluster_53:0.0191558662054217 - cluster/prob_snapshot/cluster_54:0.011848773651114948 - cluster/prob_snapshot/cluster_55:0.02123499191447644 - cluster/prob_snapshot/cluster_56:0.01875303740884027 - cluster/prob_snapshot/cluster_57:0.018338285630402018 - cluster/prob_snapshot/cluster_58:0.012522436528924782 - cluster/prob_snapshot/cluster_59:0.019043649578727822 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.015034036085348371 - cluster/prob_snapshot/cluster_62:0.014887067725817457 - cluster/prob_snapshot/cluster_63:0.008544524106061307
[36m(TaskRunner pid=2823680)[0m Training Progress:  22%|██▎       | 180/800 [5:43:13<21:35:09, 125.34s/it]
[36m(TaskRunner pid=2823680)[0m 
[36m(TaskRunner pid=2823680)[0m Training Progress:  23%|██▎       | 181/800 [5:45:10<21:08:53, 122.99s/it]
[36m(TaskRunner pid=2823680)[0m step:181 - global_seqlen/min:259166 - global_seqlen/max:427503 - global_seqlen/minmax_diff:168337 - global_seqlen/balanced_min:350183 - global_seqlen/balanced_max:350591 - global_seqlen/mean:350380.75 - frontier/skipped_zero_acc_count:20.0 - actor/entropy:np.float64(0.19110998383688707) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013681883923709393 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05151680327253416) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005559039157190084) - actor/ppo_kl:np.float64(1.4268564247762091e-05) - actor/pg_clipfrac_lower:np.float64(1.3602223218724788e-06) - actor/grad_norm:np.float64(0.22497656302792685) - perf/mfu/actor:np.float64(0.14990453947603702) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(163.23694610595703) - actor/lr:np.float64(1e-06) - training/global_step:181 - training/epoch:0 - critic/score/mean:0.6574074029922485 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6471667885780334 - critic/rewards/max:1.0095914602279663 - critic/rewards/min:-0.059717971831560135 - critic/advantages/mean:-0.11386466026306152 - critic/advantages/max:2.474842071533203 - critic/advantages/min:-2.4748497009277344 - critic/returns/mean:-0.11386466026306152 - critic/returns/max:2.474842071533203 - critic/returns/min:-2.4748497009277344 - response_length/mean:1060.89697265625 - response_length/max:8192.0 - response_length/min:150.0 - response_length/clip_ratio:0.013888888992369175 - response_length_non_aborted/mean:1060.89697265625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:150.0 - response_length_non_aborted/clip_ratio:0.013888888992369175 - response/aborted_ratio:0.0 - prompt_length/mean:238.26852416992188 - prompt_length/max:461.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.653849363327026e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2278497973456979) - timing_s/agent_loop/generate_sequences/max:np.float64(27.51215756405145) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.060549777183041) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.51215756405145) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:225 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.19728185608983 - timing_s/reward:0.0001279367133975029 - timing_s/old_log_prob:11.290595885366201 - timing_s/ref:17.83371669985354 - timing_s/adv:0.10373203177005053 - timing_s/update_actor:27.765491452999413 - timing_s/update_weights:30.63559359870851 - timing_s/step:117.29002892691642 - timing_s/stop_profile:6.23827800154686e-05 - timing_per_token_ms/adv:9.241333848566479e-05 - timing_per_token_ms/update_actor:0.024735867177024615 - timing_per_token_ms/gen:0.031853375578721525 - timing_per_token_ms/ref:0.015887795406286924 - perf/total_num_tokens:1401523 - perf/time_per_step:117.29002892691642 - perf/throughput:2987.3021023664573 - frontier/active_count:59.0 - frontier/completed_count:5.0 - frontier/blacklisted_count:786.0 - frontier/mean_score:2.680536650586271 - frontier/mean_frontier_pct:0.3796070011991153 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:4.179766025299999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.671150910899999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.4319299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:2.1376489999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.9176456999999996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.339899999999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.6021542999999996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:112.0 - frontier/cluster_10/score:4.62125597 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.4875089999999993 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.5677856999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.7598999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:2.9176456999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.1889929999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:1.8623509999999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.211645699999999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.4864119899999992 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:3.1410562999999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.6483519899999997 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:4.429812475099999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.874427699999999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.9423519899999997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.9596463929999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.4669437 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.5379299999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.8717625899999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:3.0264109999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.237 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.5883509999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.9759899999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:4.201789999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.103706392999999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:2.822350009999999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9596463929999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:3.3023509999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.8717625899999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.8319299999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:1.8659000000000001 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - 
frontier/cluster_49/score:3.5176456999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:96.0 - frontier/cluster_50/score:1.2881543 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:2.237 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.3978456999999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:112.0 - frontier/cluster_53/score:3.0075989248999995 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:2.203645699999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.2363519899999993 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.9475403108999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.8823509999999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:1.9682350999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:2.9952524384900565 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:2.3629999999999995 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:2.5379299999999994 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:181.0 - cluster/prob_snapshot/cluster_0:0.026428846603810775 - cluster/prob_snapshot/cluster_1:0.01688980609261222 - cluster/prob_snapshot/cluster_2:0.007841207492238367 - 
cluster/prob_snapshot/cluster_3:0.02170024613243901 - cluster/prob_snapshot/cluster_4:0.01321516301812611 - cluster/prob_snapshot/cluster_5:0.013516449765805865 - cluster/prob_snapshot/cluster_6:0.007841207492238367 - cluster/prob_snapshot/cluster_7:0.018448403614657733 - cluster/prob_snapshot/cluster_8:0.014795291840245587 - cluster/prob_snapshot/cluster_9:0.01645353745110901 - cluster/prob_snapshot/cluster_10:0.029220407173224165 - cluster/prob_snapshot/cluster_11:0.015728630116773137 - cluster/prob_snapshot/cluster_12:0.01623622326369046 - cluster/prob_snapshot/cluster_13:0.017450970532883373 - cluster/prob_snapshot/cluster_14:0.018448403614657733 - cluster/prob_snapshot/cluster_15:0.020164144669216764 - cluster/prob_snapshot/cluster_16:0.011775728259315872 - cluster/prob_snapshot/cluster_17:0.015870841710763892 - cluster/prob_snapshot/cluster_18:0.011559157661931264 - cluster/prob_snapshot/cluster_19:0.020307378699504178 - cluster/prob_snapshot/cluster_20:0.0157216936737193 - cluster/prob_snapshot/cluster_21:0.019861038781632548 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.016745647501066357 - cluster/prob_snapshot/cluster_24:0.02800990143453359 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0181751342771853 - cluster/prob_snapshot/cluster_28:0.018604622585912806 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013006502549418852 - cluster/prob_snapshot/cluster_31:0.018713975865791353 - cluster/prob_snapshot/cluster_32:0.01559859480958813 - cluster/prob_snapshot/cluster_33:0.016047444343824303 - cluster/prob_snapshot/cluster_34:0.018158282668041172 - cluster/prob_snapshot/cluster_35:0.019136131447296676 - cluster/prob_snapshot/cluster_36:0.014144650560549334 - cluster/prob_snapshot/cluster_37:0.016366258570875467 - cluster/prob_snapshot/cluster_38:0.01881731722024551 - cluster/prob_snapshot/cluster_39:0.018448403614657733 - 
cluster/prob_snapshot/cluster_40:0.026568105176044066 - cluster/prob_snapshot/cluster_41:0.01962487365736611 - cluster/prob_snapshot/cluster_42:0.017845844725531027 - cluster/prob_snapshot/cluster_43:0.018713975865791353 - cluster/prob_snapshot/cluster_44:0.0208809123483597 - cluster/prob_snapshot/cluster_45:0.018158282668041172 - cluster/prob_snapshot/cluster_46:0.014543002364445001 - cluster/prob_snapshot/cluster_47:0.01790641942867075 - cluster/prob_snapshot/cluster_48:0.011798168744268665 - cluster/prob_snapshot/cluster_49:0.022242230318425996 - cluster/prob_snapshot/cluster_50:0.00814505696985652 - cluster/prob_snapshot/cluster_51:0.014144650560549334 - cluster/prob_snapshot/cluster_52:0.008838640574012728 - cluster/prob_snapshot/cluster_53:0.019017181859183885 - cluster/prob_snapshot/cluster_54:0.0139337498371735 - cluster/prob_snapshot/cluster_55:0.02046359767075925 - cluster/prob_snapshot/cluster_56:0.01863742856987637 - cluster/prob_snapshot/cluster_57:0.018225233655721915 - cluster/prob_snapshot/cluster_58:0.012445238136123322 - cluster/prob_snapshot/cluster_59:0.01893911447611763 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.014941354168340667 - cluster/prob_snapshot/cluster_62:0.016047444343824303 - cluster/prob_snapshot/cluster_63:0.008491848771934625
[36m(RewardLoopWorker pid=2826755)[0m WARNING:2026-04-12 17:17:31,321:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  23%|██▎       | 182/800 [5:47:04<20:40:02, 120.39s/it]
[36m(TaskRunner pid=2823680)[0m step:182 - global_seqlen/min:261755 - global_seqlen/max:398097 - global_seqlen/minmax_diff:136342 - global_seqlen/balanced_min:328389 - global_seqlen/balanced_max:328539 - global_seqlen/mean:328441.25 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.21117105121569088) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.014867319725453854 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.07351323071634397) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005149743145693719) - actor/ppo_kl:np.float64(9.014950899199903e-05) - actor/pg_clipfrac_lower:np.float64(3.426447240902538e-06) - actor/grad_norm:np.float64(0.2502482694884141) - perf/mfu/actor:np.float64(0.17751332474510795) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(163.00601196289062) - actor/lr:np.float64(1e-06) - training/global_step:182 - training/epoch:0 - critic/score/mean:0.66015625 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6488737463951111 - critic/rewards/max:1.0103355646133423 - critic/rewards/min:-0.09031077474355698 - critic/advantages/mean:-0.183308944106102 - critic/advantages/max:2.4748306274414062 - critic/advantages/min:-2.474843740463257 - critic/returns/mean:-0.183308944106102 - critic/returns/max:2.4748306274414062 - critic/returns/min:-2.474843740463257 - response_length/mean:1000.48046875 - response_length/max:8192.0 - response_length/min:161.0 - response_length/clip_ratio:0.009114583022892475 - response_length_non_aborted/mean:1000.48046875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:161.0 - response_length_non_aborted/clip_ratio:0.009114583022892475 - response/aborted_ratio:0.0 - prompt_length/mean:232.1979217529297 - prompt_length/max:641.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.000108327716588974 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9919456476345658) - timing_s/agent_loop/generate_sequences/max:np.float64(27.319466887041926) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.74138292272346) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.319466887041926) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:241 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.421740654855967 - timing_s/reward:0.00017934106290340424 - timing_s/old_log_prob:10.403625156730413 - timing_s/ref:20.810438621789217 - timing_s/adv:0.08879675343632698 - timing_s/update_actor:21.936823040246964 - timing_s/update_weights:30.91649655252695 - timing_s/step:114.05071423575282 - timing_s/stop_profile:5.2838586270809174e-05 - timing_per_token_ms/adv:9.379638198528883e-05 - timing_per_token_ms/update_actor:0.023171957912877048 - timing_per_token_ms/gen:0.038291160438351846 - timing_per_token_ms/ref:0.021982153341342812 - perf/total_num_tokens:1313765 - perf/time_per_step:114.05071423575282 - perf/throughput:2879.782491507095 - frontier/active_count:58.0 - frontier/completed_count:6.0 - frontier/blacklisted_count:818.0 - frontier/mean_score:2.725208552035224 - frontier/mean_frontier_pct:0.38991627822640357 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:4.179766025299999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.671150910899999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.3023509999999994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:2.1376489999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.9176456999999996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.339899999999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.7215080099999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:128.0 - frontier/cluster_10/score:4.734879179 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.4875089999999993 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.5677856999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.7598999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.9423519899999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.1889929999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.203645699999999 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.211645699999999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.640488392999999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:3.1410562999999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.7538463929999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:4.429812475099999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.874427699999999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.9423519899999997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.9596463929999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:2.6268605899999997 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - 
frontier/cluster_33/score:2.5379299999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.8717625899999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:3.0264109999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.237 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.5883509999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.9831929999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.3423519899999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.841252999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.103706392999999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:2.822350009999999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9596463929999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:3.3023509999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.8717625899999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:1.8659000000000001 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:3.5176456999999997 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:96.0 - frontier/cluster_50/score:1.2881543 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:2.4659 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.3978456999999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:112.0 - frontier/cluster_53/score:3.0075989248999995 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:2.203645699999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.2363519899999993 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.9475403108999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.8823509999999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.2777645699999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:128.0 - frontier/cluster_59/score:3.5966767069430396 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:2.3629999999999995 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:2.5379299999999994 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:182.0 - cluster/prob_snapshot/cluster_0:0.026443822589970422 - cluster/prob_snapshot/cluster_1:0.016899376752507973 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.020892744580731305 - 
cluster/prob_snapshot/cluster_4:0.013222651430368375 - cluster/prob_snapshot/cluster_5:0.013524108903098337 - cluster/prob_snapshot/cluster_6:0.007845650736267858 - cluster/prob_snapshot/cluster_7:0.018458857458570878 - cluster/prob_snapshot/cluster_8:0.014803675637281796 - cluster/prob_snapshot/cluster_9:0.017217967359418895 - cluster/prob_snapshot/cluster_10:0.029955816721926216 - cluster/prob_snapshot/cluster_11:0.01573754279277713 - cluster/prob_snapshot/cluster_12:0.016245423568892085 - cluster/prob_snapshot/cluster_13:0.017460859178312763 - cluster/prob_snapshot/cluster_14:0.018615164951780255 - cluster/prob_snapshot/cluster_15:0.02017557074300705 - cluster/prob_snapshot/cluster_16:0.013941645438818234 - cluster/prob_snapshot/cluster_17:0.015879834971399342 - cluster/prob_snapshot/cluster_18:0.011565707693711208 - cluster/prob_snapshot/cluster_19:0.02031888593729255 - cluster/prob_snapshot/cluster_20:0.016705386424197385 - cluster/prob_snapshot/cluster_21:0.019872293099551477 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.017422560261921644 - cluster/prob_snapshot/cluster_24:0.028025773330212767 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.01818543327219879 - cluster/prob_snapshot/cluster_28:0.018615164951780255 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013013872723573083 - cluster/prob_snapshot/cluster_31:0.018724580197026816 - cluster/prob_snapshot/cluster_32:0.016619168391264025 - cluster/prob_snapshot/cluster_33:0.016056537676877897 - cluster/prob_snapshot/cluster_34:0.01816857211403918 - cluster/prob_snapshot/cluster_35:0.019146974994273962 - cluster/prob_snapshot/cluster_36:0.014152665669729215 - cluster/prob_snapshot/cluster_37:0.016375532560978664 - cluster/prob_snapshot/cluster_38:0.018873550807901877 - cluster/prob_snapshot/cluster_39:0.014819188464593161 - cluster/prob_snapshot/cluster_40:0.024302176782228135 - 
cluster/prob_snapshot/cluster_41:0.019635994151600437 - cluster/prob_snapshot/cluster_42:0.017855957127620427 - cluster/prob_snapshot/cluster_43:0.018724580197026816 - cluster/prob_snapshot/cluster_44:0.020892744580731305 - cluster/prob_snapshot/cluster_45:0.01816857211403918 - cluster/prob_snapshot/cluster_46:0.014551243200883858 - cluster/prob_snapshot/cluster_47:0.01823556103970034 - cluster/prob_snapshot/cluster_48:0.011804854212403997 - cluster/prob_snapshot/cluster_49:0.022254833945757974 - cluster/prob_snapshot/cluster_50:0.008149672391114917 - cluster/prob_snapshot/cluster_51:0.01560083069959109 - cluster/prob_snapshot/cluster_52:0.008843649016525972 - cluster/prob_snapshot/cluster_53:0.019027958003015967 - cluster/prob_snapshot/cluster_54:0.013941645438818234 - cluster/prob_snapshot/cluster_55:0.02047519343050193 - cluster/prob_snapshot/cluster_56:0.018647989525354222 - cluster/prob_snapshot/cluster_57:0.01823556103970034 - cluster/prob_snapshot/cluster_58:0.014410567918446365 - cluster/prob_snapshot/cluster_59:0.022754833685948807 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.014949820732038502 - cluster/prob_snapshot/cluster_62:0.016056537676877897 - cluster/prob_snapshot/cluster_63:0.008496660703820444
(RewardLoopWorker pid=2826757) WARNING:2026-04-12 17:19:29,712:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  23%|██▎       | 183/800 [5:49:03<20:33:30, 119.95s/it]
[36m(TaskRunner pid=2823680)[0m step:183 - global_seqlen/min:314724 - global_seqlen/max:462016 - global_seqlen/minmax_diff:147292 - global_seqlen/balanced_min:361529 - global_seqlen/balanced_max:361579 - global_seqlen/mean:361560.25 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.22822565410514267) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013783790171146393 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.022760186262530624) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006602576751753745) - actor/ppo_kl:np.float64(5.8897486199811635e-05) - actor/pg_clipfrac_lower:np.float64(1.9995456327427636e-07) - actor/grad_norm:np.float64(0.22933782293246344) - perf/mfu/actor:np.float64(0.1818720786359466) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(161.84737396240234) - actor/lr:np.float64(1e-06) - training/global_step:183 - training/epoch:0 - critic/score/mean:0.5811855792999268 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5694974660873413 - critic/rewards/max:1.0045963525772095 - critic/rewards/min:-0.0787607878446579 - critic/advantages/mean:-0.15701189637184143 - critic/advantages/max:2.4748435020446777 - critic/advantages/min:-2.474839210510254 - critic/returns/mean:-0.15701189637184143 - critic/returns/max:2.4748435020446777 - critic/returns/min:-2.474839210510254 - response_length/mean:1154.5308837890625 - response_length/max:8192.0 - response_length/min:154.0 - response_length/clip_ratio:0.011597937904298306 - response_length_non_aborted/mean:1154.5308837890625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:154.0 - response_length_non_aborted/clip_ratio:0.011597937904298306 - response/aborted_ratio:0.0 - prompt_length/mean:235.38143920898438 - prompt_length/max:371.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.471217006444931e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3048064159229398) - timing_s/agent_loop/generate_sequences/max:np.float64(28.18102397583425) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.6835515252732876) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.18102397583425) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:216 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.472834541462362 - timing_s/reward:0.00012886803597211838 - timing_s/old_log_prob:11.564863719977438 - timing_s/ref:21.36484988965094 - timing_s/adv:0.11386243999004364 - timing_s/update_actor:23.41411177907139 - timing_s/update_weights:30.999924769625068 - timing_s/step:118.35190903022885 - timing_s/stop_profile:8.822605013847351e-05 - timing_per_token_ms/adv:0.0001055677692263879 - timing_per_token_ms/update_actor:0.021708436505927645 - timing_per_token_ms/gen:0.034013048702626544 - timing_per_token_ms/ref:0.01980845960181698 - perf/total_num_tokens:1446241 - perf/time_per_step:118.35190903022885 - perf/throughput:3054.9591718681286 - frontier/active_count:58.0 - frontier/completed_count:6.0 - frontier/blacklisted_count:848.0 - frontier/mean_score:2.702948230816948 - frontier/mean_frontier_pct:0.4033358266537833 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.825836217709999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.671150910899999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.211645699999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:1.7963542999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.9176456999999996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.5379299999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.7215080099999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:128.0 - frontier/cluster_10/score:4.734879179 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.4875089999999993 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.5677856999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.7598999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.9423519899999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:3.1322950999999994 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:2.203645699999999 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.1481519899999992 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.640488392999999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:3.0987394099999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.827692475099999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:4.429812475099999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.874427699999999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.9596463929999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.9596463929999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:2.6268605899999997 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.5379299999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.8717625899999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:3.018487699999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:32.0 - frontier/cluster_36/score:1.8659000000000001 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.5883509999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.3882350999999993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.5396463929999995 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.841252999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.072594475099999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:2.822350009999999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9596463929999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:3.3023509999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.8717625899999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:1.8659000000000001 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - 
frontier/cluster_49/score:3.5176456999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:96.0 - frontier/cluster_50/score:1.2881543 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:2.62613 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.3978456999999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:112.0 - frontier/cluster_53/score:3.0075989248999995 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:2.203645699999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.2363519899999993 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.9632782176299997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.8823509999999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.2777645699999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:128.0 - frontier/cluster_59/score:3.5966767069430396 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:2.3629999999999995 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:2.5379299999999994 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.343 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:183.0 - cluster/prob_snapshot/cluster_0:0.024403979571912225 - cluster/prob_snapshot/cluster_1:0.01703855276432001 - cluster/prob_snapshot/cluster_2:0.0 - 
cluster/prob_snapshot/cluster_3:0.02048622355871083 - cluster/prob_snapshot/cluster_4:0.013331547510893136 - cluster/prob_snapshot/cluster_5:0.011458460620501043 - cluster/prob_snapshot/cluster_6:0.007910264147492144 - cluster/prob_snapshot/cluster_7:0.018610876683972755 - cluster/prob_snapshot/cluster_8:0.01618877242790479 - cluster/prob_snapshot/cluster_9:0.017359767146694367 - cluster/prob_snapshot/cluster_10:0.030202519968027364 - cluster/prob_snapshot/cluster_11:0.015867150438887208 - cluster/prob_snapshot/cluster_12:0.016379213903034442 - cluster/prob_snapshot/cluster_13:0.01760465931833204 - cluster/prob_snapshot/cluster_14:0.01876847145859137 - cluster/prob_snapshot/cluster_15:0.019980067437219026 - cluster/prob_snapshot/cluster_16:0.014056462845323135 - cluster/prob_snapshot/cluster_17:0.016010614474804674 - cluster/prob_snapshot/cluster_18:0.011660957896968298 - cluster/prob_snapshot/cluster_19:0.02008121427090802 - cluster/prob_snapshot/cluster_20:0.016842964814948017 - cluster/prob_snapshot/cluster_21:0.019766024721670793 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01803708927176586 - cluster/prob_snapshot/cluster_24:0.028256581567532423 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.018335200693386254 - cluster/prob_snapshot/cluster_28:0.018878787800824397 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013121049392300084 - cluster/prob_snapshot/cluster_31:0.018878787800824397 - cluster/prob_snapshot/cluster_32:0.01675603672730994 - cluster/prob_snapshot/cluster_33:0.01618877242790479 - cluster/prob_snapshot/cluster_34:0.018318200673966757 - cluster/prob_snapshot/cluster_35:0.019254120662007913 - cluster/prob_snapshot/cluster_36:0.011902073923720337 - cluster/prob_snapshot/cluster_37:0.01651039441692237 - cluster/prob_snapshot/cluster_38:0.015233908948723738 - cluster/prob_snapshot/cluster_39:0.01619972083691286 - 
cluster/prob_snapshot/cluster_40:0.024502319076966882 - cluster/prob_snapshot/cluster_41:0.019599253218422676 - cluster/prob_snapshot/cluster_42:0.018003011124729525 - cluster/prob_snapshot/cluster_43:0.018878787800824397 - cluster/prob_snapshot/cluster_44:0.021064808255571988 - cluster/prob_snapshot/cluster_45:0.018318200673966757 - cluster/prob_snapshot/cluster_46:0.014671080992848905 - cluster/prob_snapshot/cluster_47:0.01838574129166045 - cluster/prob_snapshot/cluster_48:0.011902073923720337 - cluster/prob_snapshot/cluster_49:0.022438115203846382 - cluster/prob_snapshot/cluster_50:0.008216789594168081 - cluster/prob_snapshot/cluster_51:0.016751376490326216 - cluster/prob_snapshot/cluster_52:0.008916481513132858 - cluster/prob_snapshot/cluster_53:0.019184664096179648 - cluster/prob_snapshot/cluster_54:0.014056462845323135 - cluster/prob_snapshot/cluster_55:0.020643818333329445 - cluster/prob_snapshot/cluster_56:0.018901954232693335 - cluster/prob_snapshot/cluster_57:0.01838574129166045 - cluster/prob_snapshot/cluster_58:0.01452924716917898 - cluster/prob_snapshot/cluster_59:0.022942232727241055 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.015072941037435634 - cluster/prob_snapshot/cluster_62:0.01618877242790479 - cluster/prob_snapshot/cluster_63:0.00856663555365047
[36m(TaskRunner pid=2823680)[0m Training Progress:  23%|██▎       | 184/800 [5:50:55<20:07:16, 117.59s/it]
[36m(TaskRunner pid=2823680)[0m step:184 - global_seqlen/min:283111 - global_seqlen/max:438001 - global_seqlen/minmax_diff:154890 - global_seqlen/balanced_min:364425 - global_seqlen/balanced_max:364609 - global_seqlen/mean:364513.75 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.208479624127551) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.015037013217806816 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.1125167570571648) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006079173364589818) - actor/ppo_kl:np.float64(2.810699110879647e-06) - actor/pg_clipfrac_lower:np.float64(4.451350016134544e-06) - actor/grad_norm:np.float64(0.2558531623620253) - perf/mfu/actor:np.float64(0.20131576137514107) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(162.02349090576172) - actor/lr:np.float64(1e-06) - training/global_step:184 - training/epoch:0 - critic/score/mean:0.6159793734550476 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6039682030677795 - critic/rewards/max:1.0466846227645874 - critic/rewards/min:-0.09752456098794937 - critic/advantages/mean:-0.13550709187984467 - critic/advantages/max:2.4747314453125 - critic/advantages/min:-2.474838972091675 - critic/returns/mean:-0.13550709187984467 - critic/returns/max:2.4747314453125 - critic/returns/min:-2.474838972091675 - response_length/mean:1099.2822265625 - response_length/max:8192.0 - response_length/min:144.0 - response_length/clip_ratio:0.007731958758085966 - response_length_non_aborted/mean:1099.2822265625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:144.0 - response_length_non_aborted/clip_ratio:0.007731958758085966 - response/aborted_ratio:0.0 - prompt_length/mean:228.5154571533203 - prompt_length/max:371.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.123794734477997e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4150906931608915) - timing_s/agent_loop/generate_sequences/max:np.float64(29.345601012930274) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.081697653491574) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.345601012930274) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:218 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.069253670051694 - timing_s/reward:0.00016707181930541992 - timing_s/old_log_prob:10.140836130827665 - timing_s/ref:21.01814148016274 - timing_s/adv:0.1314074331894517 - timing_s/update_actor:21.42027312517166 - timing_s/update_weights:27.583003694191575 - timing_s/step:111.84963317215443 - timing_s/stop_profile:5.6842342019081116e-05 - timing_per_token_ms/adv:0.0001275340951846002 - timing_per_token_ms/update_actor:0.020788893636536413 - timing_per_token_ms/gen:0.03642167354992854 - timing_per_token_ms/ref:0.020398615139753294 - perf/total_num_tokens:1458055 - perf/time_per_step:111.84963317215443 - perf/throughput:3258.962409281711 - frontier/active_count:57.0 - frontier/completed_count:7.0 - frontier/blacklisted_count:879.0 - frontier/mean_score:2.703573068672984 - frontier/mean_frontier_pct:0.4180517950203867 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.671150910899999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.1481519899999992 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:1.7963542999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9423519899999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.5379299999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.7215080099999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:144.0 - frontier/cluster_10/score:4.8144154253 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.4875089999999993 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.5677856999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.7598999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.9423519899999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:3.1322950999999994 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.203645699999999 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.6569999999999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.103706392999999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.640488392999999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:3.069117586999999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.827692475099999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:4.429812475099999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.874427699999999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.9596463929999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.9596463929999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:2.6268605899999997 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - 
frontier/cluster_33/score:2.676550999999999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.8717625899999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:3.018487699999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:32.0 - frontier/cluster_36/score:1.8659000000000001 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.5883509999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.3882350999999993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.5396463929999995 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.841252999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:3.050816132569999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:2.822350009999999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9596463929999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:3.3023509999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.8717625899999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:1.8659000000000001 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - 
frontier/cluster_49/score:3.5176456999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:96.0 - frontier/cluster_50/score:1.2881543 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.7382909999999994 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.3978456999999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:112.0 - frontier/cluster_53/score:3.0075989248999995 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:80.0 - frontier/cluster_54/score:2.4425519899999992 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.1654463929999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.9632782176299997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.8823509999999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.2777645699999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:144.0 - frontier/cluster_59/score:4.017673694860127 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:3.2765509999999995 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.2401 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:184.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.017333467782122088 - cluster/prob_snapshot/cluster_2:0.0 - 
cluster/prob_snapshot/cluster_3:0.020428793771708927 - cluster/prob_snapshot/cluster_4:0.013562299126120548 - cluster/prob_snapshot/cluster_5:0.011656791556503773 - cluster/prob_snapshot/cluster_6:0.008047180452776121 - cluster/prob_snapshot/cluster_7:0.019093329101778016 - cluster/prob_snapshot/cluster_8:0.016468978861796706 - cluster/prob_snapshot/cluster_9:0.017660241964475153 - cluster/prob_snapshot/cluster_10:0.031241407710683004 - cluster/prob_snapshot/cluster_11:0.01614179001766363 - cluster/prob_snapshot/cluster_12:0.016662716629270255 - cluster/prob_snapshot/cluster_13:0.017909372898650765 - cluster/prob_snapshot/cluster_14:0.019093329101778016 - cluster/prob_snapshot/cluster_15:0.020325896218890752 - cluster/prob_snapshot/cluster_16:0.014299761794923108 - cluster/prob_snapshot/cluster_17:0.017241640563685306 - cluster/prob_snapshot/cluster_18:0.011862793795435872 - cluster/prob_snapshot/cluster_19:0.020140380144267295 - cluster/prob_snapshot/cluster_20:0.01713449446168198 - cluster/prob_snapshot/cluster_21:0.019915928597192007 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01834928764784037 - cluster/prob_snapshot/cluster_24:0.028745665961687383 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.018652558987467316 - cluster/prob_snapshot/cluster_28:0.01920555487531566 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013348157560971274 - cluster/prob_snapshot/cluster_31:0.01920555487531566 - cluster/prob_snapshot/cluster_32:0.017046061762773925 - cluster/prob_snapshot/cluster_33:0.01736850970732874 - cluster/prob_snapshot/cluster_34:0.01863526471999171 - cluster/prob_snapshot/cluster_35:0.019587384256418955 - cluster/prob_snapshot/cluster_36:0.012108083224606858 - cluster/prob_snapshot/cluster_37:0.01679616770592978 - cluster/prob_snapshot/cluster_38:0.015497587947225078 - cluster/prob_snapshot/cluster_39:0.016480116773415834 - 
cluster/prob_snapshot/cluster_40:0.024926422107707142 - cluster/prob_snapshot/cluster_41:0.019797167927611756 - cluster/prob_snapshot/cluster_42:0.018314619652741296 - cluster/prob_snapshot/cluster_43:0.01920555487531566 - cluster/prob_snapshot/cluster_44:0.021429412479159478 - cluster/prob_snapshot/cluster_45:0.01863526471999171 - cluster/prob_snapshot/cluster_46:0.014925018177070459 - cluster/prob_snapshot/cluster_47:0.018703974377259656 - cluster/prob_snapshot/cluster_48:0.012108083224606858 - cluster/prob_snapshot/cluster_49:0.022826489570866842 - cluster/prob_snapshot/cluster_50:0.008359011453204989 - cluster/prob_snapshot/cluster_51:0.017769149108308017 - cluster/prob_snapshot/cluster_52:0.009070814122278165 - cluster/prob_snapshot/cluster_53:0.019516725488465246 - cluster/prob_snapshot/cluster_54:0.015850057760517223 - cluster/prob_snapshot/cluster_55:0.02054101954524657 - cluster/prob_snapshot/cluster_56:0.019229122287758565 - cluster/prob_snapshot/cluster_57:0.018703974377259656 - cluster/prob_snapshot/cluster_58:0.014780729395798728 - cluster/prob_snapshot/cluster_59:0.02607124040231532 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.012680425226005819 - cluster/prob_snapshot/cluster_62:0.02126199271004278 - cluster/prob_snapshot/cluster_63:0.008047180452776121
[36m(TaskRunner pid=2823680)[0m Training Progress:  23%|██▎       | 185/800 [5:52:54<20:09:45, 118.02s/it]
[36m(TaskRunner pid=2823680)[0m step:185 - global_seqlen/min:364174 - global_seqlen/max:444575 - global_seqlen/minmax_diff:80401 - global_seqlen/balanced_min:401352 - global_seqlen/balanced_max:401515 - global_seqlen/mean:401416.25 - frontier/skipped_zero_acc_count:35.0 - actor/entropy:np.float64(0.21841757663307673) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.014272413216531277 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04648032332625007) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005280908023857418) - actor/ppo_kl:np.float64(0.00010004437616586567) - actor/pg_clipfrac_lower:np.float64(5.545468043914916e-07) - actor/grad_norm:np.float64(0.2499233471850554) - perf/mfu/actor:np.float64(0.2206329629269571) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(103.96846008300781) - actor/lr:np.float64(1e-06) - training/global_step:185 - training/epoch:0 - critic/score/mean:0.6061828136444092 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5941444635391235 - critic/rewards/max:1.0011197328567505 - critic/rewards/min:-0.08068132400512695 - critic/advantages/mean:-0.16005586087703705 - critic/advantages/max:2.474773406982422 - critic/advantages/min:-2.4748384952545166 - critic/returns/mean:-0.16005586087703705 - critic/returns/max:2.474773406982422 - critic/returns/min:-2.4748384952545166 - response_length/mean:1175.356201171875 - response_length/max:8192.0 - response_length/min:84.0 - response_length/clip_ratio:0.014784946106374264 - response_length_non_aborted/mean:1175.356201171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:84.0 - response_length_non_aborted/clip_ratio:0.014784946106374264 - response/aborted_ratio:0.0 - prompt_length/mean:231.26881408691406 - prompt_length/max:348.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.529106318950653e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.750281679444015) - timing_s/agent_loop/generate_sequences/max:np.float64(29.19989851396531) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.059388769797806) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.19989851396531) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:200 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.76470174547285 - timing_s/reward:0.00014223065227270126 - timing_s/old_log_prob:10.002412807196379 - timing_s/ref:27.612982232123613 - timing_s/adv:0.0794453090056777 - timing_s/update_actor:21.516557698138058 - timing_s/update_weights:28.363423037342727 - timing_s/step:118.75967006478459 - timing_s/stop_profile:6.669294089078903e-05 - timing_per_token_ms/adv:7.59131462249758e-05 - timing_per_token_ms/update_actor:0.020559924950133306 - timing_per_token_ms/gen:0.0351811699101426 - timing_per_token_ms/ref:0.026385300581372912 - perf/total_num_tokens:1605665 - perf/time_per_step:118.75967006478459 - perf/throughput:3380.0721219671914 - frontier/active_count:56.0 - frontier/completed_count:8.0 - frontier/blacklisted_count:914.0 - frontier/mean_score:2.659832603898216 - frontier/mean_frontier_pct:0.4255706486807469 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.671150910899999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.103706392999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:1.7963542999999997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:1.2401 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9423519899999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.676550999999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.7215080099999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:160.0 - frontier/cluster_10/score:4.87009079771 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.6412562999999993 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:2.6974499899999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.7598999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.9596463929999994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.492606569999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:2.203645699999999 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.6569999999999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.672594475099999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.640488392999999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:3.069117586999999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.827692475099999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.874427699999999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.9596463929999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.9596463929999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:2.6268605899999997 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.773585699999999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.8717625899999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:3.018487699999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:1.60613 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.5883509999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.3882350999999993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.5396463929999995 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.841252999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:3.050816132569999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:2.8756450069999993 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9596463929999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:3.3023509999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.8717625899999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:1.8659000000000001 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - 
frontier/cluster_49/score:3.5176456999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:96.0 - frontier/cluster_50/score:1.8017080099999998 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.2168036999999994 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.3978456999999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:2.405319247429999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:80.0 - frontier/cluster_54/score:2.4425519899999992 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.1654463929999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.9632782176299997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.8823509999999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:1.8944351989999997 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:144.0 - frontier/cluster_59/score:4.017673694860127 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:3.2765509999999995 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.2401 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:185.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.01793312982893034 - cluster/prob_snapshot/cluster_2:0.0 - 
cluster/prob_snapshot/cluster_3:0.020837149061636754 - cluster/prob_snapshot/cluster_4:0.014031495259036516 - cluster/prob_snapshot/cluster_5:0.012060065475598018 - cluster/prob_snapshot/cluster_6:0.008325577641498175 - cluster/prob_snapshot/cluster_7:0.019753874640240025 - cluster/prob_snapshot/cluster_8:0.017969384051229394 - cluster/prob_snapshot/cluster_9:0.01827120896638512 - cluster/prob_snapshot/cluster_10:0.032696007626385276 - cluster/prob_snapshot/cluster_11:0.017732428349928383 - cluster/prob_snapshot/cluster_12:0.018109692223049326 - cluster/prob_snapshot/cluster_13:0.018528958739432954 - cluster/prob_snapshot/cluster_14:0.019869982933877522 - cluster/prob_snapshot/cluster_15:0.016734448454353235 - cluster/prob_snapshot/cluster_16:0.014794470905333108 - cluster/prob_snapshot/cluster_17:0.017838125791033502 - cluster/prob_snapshot/cluster_18:0.01227319448949505 - cluster/prob_snapshot/cluster_19:0.0246564554859949 - cluster/prob_snapshot/cluster_20:0.017727272903311216 - cluster/prob_snapshot/cluster_21:0.02060493247436176 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.018984092611664533 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.019297855811001543 - cluster/prob_snapshot/cluster_28:0.019869982933877522 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013809945333893832 - cluster/prob_snapshot/cluster_31:0.019869982933877522 - cluster/prob_snapshot/cluster_32:0.017635780820447304 - cluster/prob_snapshot/cluster_33:0.01862083952157008 - cluster/prob_snapshot/cluster_34:0.019279963237638 - cluster/prob_snapshot/cluster_35:0.02026502193876078 - cluster/prob_snapshot/cluster_36:0.01078296912937623 - cluster/prob_snapshot/cluster_37:0.017377241524029866 - cluster/prob_snapshot/cluster_38:0.01603373659479167 - cluster/prob_snapshot/cluster_39:0.017050256613879752 - 
cluster/prob_snapshot/cluster_40:0.025788767109215207 - cluster/prob_snapshot/cluster_41:0.020482063205908164 - cluster/prob_snapshot/cluster_42:0.01930602836478111 - cluster/prob_snapshot/cluster_43:0.019869982933877522 - cluster/prob_snapshot/cluster_44:0.022170776268026073 - cluster/prob_snapshot/cluster_45:0.019279963237638 - cluster/prob_snapshot/cluster_46:0.0154413584190354 - cluster/prob_snapshot/cluster_47:0.019351049948028304 - cluster/prob_snapshot/cluster_48:0.01252696985829485 - cluster/prob_snapshot/cluster_49:0.023616186106468987 - cluster/prob_snapshot/cluster_50:0.012096008325590006 - cluster/prob_snapshot/cluster_51:0.014882808902758182 - cluster/prob_snapshot/cluster_52:0.009384624551394535 - cluster/prob_snapshot/cluster_53:0.0161484333094657 - cluster/prob_snapshot/cluster_54:0.016398400319442678 - cluster/prob_snapshot/cluster_55:0.02125164883067643 - cluster/prob_snapshot/cluster_56:0.019894365675541398 - cluster/prob_snapshot/cluster_57:0.019351049948028304 - cluster/prob_snapshot/cluster_58:0.0127185447432155 - cluster/prob_snapshot/cluster_59:0.02697319110133282 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.01311911238549438 - cluster/prob_snapshot/cluster_62:0.021997564508369066 - cluster/prob_snapshot/cluster_63:0.008325577641498175
[36m(TaskRunner pid=2823680)[0m Training Progress:  23%|██▎       | 186/800 [5:54:40<19:28:10, 114.15s/it]
[36m(TaskRunner pid=2823680)[0m step:186 - global_seqlen/min:357032 - global_seqlen/max:405133 - global_seqlen/minmax_diff:48101 - global_seqlen/balanced_min:380778 - global_seqlen/balanced_max:381268 - global_seqlen/mean:381012.5 - frontier/skipped_zero_acc_count:39.0 - actor/entropy:np.float64(0.21836290839645597) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.014130293391644955 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.007868327302276157) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003222408854021018) - actor/ppo_kl:np.float64(3.817065355380893e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.25038639456033707) - perf/mfu/actor:np.float64(0.22753864660490886) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(103.95284271240234) - actor/lr:np.float64(1e-06) - training/global_step:186 - training/epoch:0 - critic/score/mean:0.6109550595283508 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6001532673835754 - critic/rewards/max:1.0041358470916748 - critic/rewards/min:-0.06294746696949005 - critic/advantages/mean:-0.1833730936050415 - critic/advantages/max:2.4748454093933105 - critic/advantages/min:-2.4748549461364746 - critic/returns/mean:-0.1833730936050415 - critic/returns/max:2.4748454093933105 - critic/returns/min:-2.4748549461364746 - response_length/mean:1075.334228515625 - response_length/max:8192.0 - response_length/min:143.0 - response_length/clip_ratio:0.014044944196939468 - response_length_non_aborted/mean:1075.334228515625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:143.0 - response_length_non_aborted/clip_ratio:0.014044944196939468 - response/aborted_ratio:0.0 - prompt_length/mean:233.42697143554688 - prompt_length/max:393.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.991267532110214e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1427739057689905) - timing_s/agent_loop/generate_sequences/max:np.float64(28.011506251990795) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.645436535020053) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.011506251990795) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:216 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.96261991094798 - timing_s/reward:0.00013970304280519485 - timing_s/old_log_prob:9.599588827230036 - timing_s/ref:19.14089106209576 - timing_s/adv:0.06440744735300541 - timing_s/update_actor:19.68764887843281 - timing_s/update_weights:26.075944567099214 - timing_s/step:104.92233585938811 - timing_s/stop_profile:5.287863314151764e-05 - timing_per_token_ms/adv:6.911871736611451e-05 - timing_per_token_ms/update_actor:0.021127759201098054 - timing_per_token_ms/gen:0.039134186013426685 - timing_per_token_ms/ref:0.02054100719448634 - perf/total_num_tokens:1524050 - perf/time_per_step:104.92233585938811 - perf/throughput:3631.3764545865115 - frontier/active_count:55.0 - frontier/completed_count:9.0 - frontier/blacklisted_count:953.0 - frontier/mean_score:2.6599459719914376 - frontier/mean_frontier_pct:0.44807641612609606 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:6.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:5.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.671150910899999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.103706392999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.5574480099999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9423519899999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.676550999999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.7215080099999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:176.0 - frontier/cluster_10/score:4.9090635583969995 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.6412562999999993 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:2.6974499899999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.8319299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.9596463929999994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.492606569999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.203645699999999 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:2.7598999999999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.672594475099999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.640488392999999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:3.069117586999999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.827692475099999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.874427699999999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.9717524750999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.9596463929999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:2.6268605899999997 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - 
frontier/cluster_33/score:2.773585699999999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.8717625899999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:3.612941389999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:1.60613 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.5883509999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.3882350999999993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.5396463929999995 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.841252999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.435571292798999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.312951504899999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9596463929999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:3.211645699999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.9102338129999996 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:1.8659000000000001 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:3.5176456999999997 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:112.0 - frontier/cluster_50/score:1.561195607 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.2168036999999994 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.3978456999999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:2.5837234732009993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:2.0097863929999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.1654463929999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.9632782176299997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.3176456999999995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:1.8944351989999997 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:4.312371586402088 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:3.2765509999999995 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.2401 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:186.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.01825840852016303 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02121510208157505 - 
cluster/prob_snapshot/cluster_4:0.014286004452771013 - cluster/prob_snapshot/cluster_5:0.010645793878382466 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0201121787706984 - cluster/prob_snapshot/cluster_8:0.018295320336874976 - cluster/prob_snapshot/cluster_9:0.018602619879957885 - cluster/prob_snapshot/cluster_10:0.0335554563895672 - cluster/prob_snapshot/cluster_11:0.018054066632875353 - cluster/prob_snapshot/cluster_12:0.01843817347763977 - cluster/prob_snapshot/cluster_13:0.01935739932532814 - cluster/prob_snapshot/cluster_14:0.020230393085658212 - cluster/prob_snapshot/cluster_15:0.017037984955993433 - cluster/prob_snapshot/cluster_16:0.01506281927393765 - cluster/prob_snapshot/cluster_17:0.018865044827369722 - cluster/prob_snapshot/cluster_18:0.012495810880435736 - cluster/prob_snapshot/cluster_19:0.025103684700718092 - cluster/prob_snapshot/cluster_20:0.01804881767458764 - cluster/prob_snapshot/cluster_21:0.020978673451655418 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.019328434110213243 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.019647888479123604 - cluster/prob_snapshot/cluster_28:0.020313143106130083 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01406043596141147 - cluster/prob_snapshot/cluster_31:0.020230393085658212 - cluster/prob_snapshot/cluster_32:0.01795566606964052 - cluster/prob_snapshot/cluster_33:0.01895859218198182 - cluster/prob_snapshot/cluster_34:0.019629671362699146 - cluster/prob_snapshot/cluster_35:0.024695931476143868 - cluster/prob_snapshot/cluster_36:0.010978555182645507 - cluster/prob_snapshot/cluster_37:0.017692437278150383 - cluster/prob_snapshot/cluster_38:0.016324563288451683 - cluster/prob_snapshot/cluster_39:0.017359521377445855 - cluster/prob_snapshot/cluster_40:0.026256534670918662 - 
cluster/prob_snapshot/cluster_41:0.01664812551864486 - cluster/prob_snapshot/cluster_42:0.01580996913782048 - cluster/prob_snapshot/cluster_43:0.020230393085658212 - cluster/prob_snapshot/cluster_44:0.021952911373647306 - cluster/prob_snapshot/cluster_45:0.019892637900058737 - cluster/prob_snapshot/cluster_46:0.015721440306877194 - cluster/prob_snapshot/cluster_47:0.01970204747389903 - cluster/prob_snapshot/cluster_48:0.01275418933417485 - cluster/prob_snapshot/cluster_49:0.024044546475344885 - cluster/prob_snapshot/cluster_50:0.010671410236004089 - cluster/prob_snapshot/cluster_51:0.015152759583310646 - cluster/prob_snapshot/cluster_52:0.009554846839467375 - cluster/prob_snapshot/cluster_53:0.017660806240611705 - cluster/prob_snapshot/cluster_54:0.013737711655271094 - cluster/prob_snapshot/cluster_55:0.021637120222682268 - cluster/prob_snapshot/cluster_56:0.020255218091799775 - cluster/prob_snapshot/cluster_57:0.015842055880452434 - cluster/prob_snapshot/cluster_58:0.012949239085358917 - cluster/prob_snapshot/cluster_59:0.029476822815953797 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.013357072392899443 - cluster/prob_snapshot/cluster_62:0.022396565634321203 - cluster/prob_snapshot/cluster_63:0.008476590488938437
[36m(TaskRunner pid=2823680)[0m Training Progress:  23%|██▎       | 187/800 [5:56:30<19:15:56, 113.14s/it]
[36m(TaskRunner pid=2823680)[0m step:187 - global_seqlen/min:368783 - global_seqlen/max:446196 - global_seqlen/minmax_diff:77413 - global_seqlen/balanced_min:400179 - global_seqlen/balanced_max:400342 - global_seqlen/mean:400262.25 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.20503034035922313) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013238625600934029 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.019048121728701517) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004840133230146068) - actor/ppo_kl:np.float64(-3.575391674524892e-05) - actor/pg_clipfrac_lower:np.float64(5.285171380596787e-06) - actor/grad_norm:np.float64(0.24419615933528313) - perf/mfu/actor:np.float64(0.2264922407546546) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(104.28875732421875) - actor/lr:np.float64(1e-06) - training/global_step:187 - training/epoch:0 - critic/score/mean:0.5734536051750183 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5621384978294373 - critic/rewards/max:1.0092780590057373 - critic/rewards/min:-0.04931037127971649 - critic/advantages/mean:-0.15159378945827484 - critic/advantages/max:2.4747090339660645 - critic/advantages/min:-2.474858045578003 - critic/returns/mean:-0.15159378945827484 - critic/returns/max:2.4747090339660645 - critic/returns/min:-2.474858045578003 - response_length/mean:1174.4046630859375 - response_length/max:8192.0 - response_length/min:157.0 - response_length/clip_ratio:0.009020618163049221 - response_length_non_aborted/mean:1174.4046630859375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:157.0 - response_length_non_aborted/clip_ratio:0.009020618163049221 - response/aborted_ratio:0.0 - prompt_length/mean:233.4845428466797 - prompt_length/max:505.0 - prompt_length/min:169.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.931569755077362e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2487177159637213) - timing_s/agent_loop/generate_sequences/max:np.float64(29.120784039609134) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.1666039873180125) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.120784039609134) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:205 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.70503910072148 - timing_s/reward:0.00025299564003944397 - timing_s/old_log_prob:10.586237228475511 - timing_s/ref:20.688946161419153 - timing_s/adv:0.0929220961406827 - timing_s/update_actor:20.82362251728773 - timing_s/update_weights:27.24487833119929 - timing_s/step:110.54748572595417 - timing_s/stop_profile:6.460398435592651e-05 - timing_per_token_ms/adv:8.505283750870252e-05 - timing_per_token_ms/update_actor:0.019060140223526603 - timing_per_token_ms/gen:0.03369226247640445 - timing_per_token_ms/ref:0.01893686915359064 - perf/total_num_tokens:1601049 - perf/time_per_step:110.54748572595417 - perf/throughput:3620.7268520990615 - frontier/active_count:55.0 - frontier/completed_count:9.0 - frontier/blacklisted_count:984.0 - frontier/mean_score:2.648424149497988 - frontier/mean_frontier_pct:0.4621092955171601 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:5.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:112.0 - frontier/cluster_1/score:2.7698056376299993 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.103706392999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.5574480099999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9423519899999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.676550999999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.7215080099999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:192.0 - frontier/cluster_10/score:4.936344490877899 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.7488794099999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:2.6974499899999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.8319299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:2.9717524750999993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.492606569999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:2.203645699999999 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:2.2319299999999993 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.470816132569999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.640488392999999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:3.069117586999999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.827692475099999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.874427699999999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.9717524750999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.9596463929999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:2.6268605899999997 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.841509989999999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.8717625899999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:3.612941389999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:1.60613 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.5883509999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.9717645699999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.5396463929999995 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.841252999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.435571292798999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.312951504899999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.9717524750999993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.548151989999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.9102338129999996 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.51 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:1.8659000000000001 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - 
frontier/cluster_49/score:3.5176456999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:112.0 - frontier/cluster_50/score:1.9928369248999998 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.2168036999999994 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.8784919899999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:2.5837234732009993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:2.0097863929999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.1654463929999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.9632782176299997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.5223519899999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:1.8944351989999997 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:3.918660110481462 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:3.1935856999999994 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.2401 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:187.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.019015119806965 - cluster/prob_snapshot/cluster_2:0.0 - 
cluster/prob_snapshot/cluster_3:0.02130739720749385 - cluster/prob_snapshot/cluster_4:0.014348154923448701 - cluster/prob_snapshot/cluster_5:0.010692107814687504 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.020199675689874443 - cluster/prob_snapshot/cluster_8:0.01837491311412035 - cluster/prob_snapshot/cluster_9:0.01868354954683568 - cluster/prob_snapshot/cluster_10:0.03388872512470269 - cluster/prob_snapshot/cluster_11:0.018871458201224045 - cluster/prob_snapshot/cluster_12:0.018518387729557488 - cluster/prob_snapshot/cluster_13:0.019441612618355062 - cluster/prob_snapshot/cluster_14:0.02040151430951049 - cluster/prob_snapshot/cluster_15:0.017112107765342617 - cluster/prob_snapshot/cluster_16:0.015128349234445719 - cluster/prob_snapshot/cluster_17:0.015322525080522896 - cluster/prob_snapshot/cluster_18:0.012550173213184962 - cluster/prob_snapshot/cluster_19:0.023827659129626515 - cluster/prob_snapshot/cluster_20:0.018127338055661288 - cluster/prob_snapshot/cluster_21:0.02106994000792203 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.019412521391676988 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.01973336552911594 - cluster/prob_snapshot/cluster_28:0.02040151430951049 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.014121605108867931 - cluster/prob_snapshot/cluster_31:0.020318404289660353 - cluster/prob_snapshot/cluster_32:0.018033781199819073 - cluster/prob_snapshot/cluster_33:0.019507380647390986 - cluster/prob_snapshot/cluster_34:0.019715069160136024 - cluster/prob_snapshot/cluster_35:0.02480336975744502 - cluster/prob_snapshot/cluster_36:0.011026316778563956 - cluster/prob_snapshot/cluster_37:0.017769407246059025 - cluster/prob_snapshot/cluster_38:0.013536451446376654 - cluster/prob_snapshot/cluster_39:0.017435043013177838 - 
cluster/prob_snapshot/cluster_40:0.026370762269934015 - cluster/prob_snapshot/cluster_41:0.01672055226611688 - cluster/prob_snapshot/cluster_42:0.015878749532406224 - cluster/prob_snapshot/cluster_43:0.02040151430951049 - cluster/prob_snapshot/cluster_44:0.017493435177518706 - cluster/prob_snapshot/cluster_45:0.019979179718843464 - cluster/prob_snapshot/cluster_46:0.017231516199931216 - cluster/prob_snapshot/cluster_47:0.019787760139596787 - cluster/prob_snapshot/cluster_48:0.01280967572806839 - cluster/prob_snapshot/cluster_49:0.02414915094229816 - cluster/prob_snapshot/cluster_50:0.0136811162371456 - cluster/prob_snapshot/cluster_51:0.01521868082415038 - cluster/prob_snapshot/cluster_52:0.012896121576544232 - cluster/prob_snapshot/cluster_53:0.01773763859944445 - cluster/prob_snapshot/cluster_54:0.013797476808518254 - cluster/prob_snapshot/cluster_55:0.021731251315136785 - cluster/prob_snapshot/cluster_56:0.020343337295615396 - cluster/prob_snapshot/cluster_57:0.017316314413391928 - cluster/prob_snapshot/cluster_58:0.013005574032385822 - cluster/prob_snapshot/cluster_59:0.026902173376806997 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.013415181596129717 - cluster/prob_snapshot/cluster_62:0.021924431763115006 - cluster/prob_snapshot/cluster_63:0.008513467426109443
[36m(TaskRunner pid=2823680)[0m Training Progress:  24%|██▎       | 188/800 [5:58:20<19:02:44, 112.03s/it]
[36m(TaskRunner pid=2823680)[0m step:188 - global_seqlen/min:285798 - global_seqlen/max:438053 - global_seqlen/minmax_diff:152255 - global_seqlen/balanced_min:357446 - global_seqlen/balanced_max:357660 - global_seqlen/mean:357585.25 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.19645861068662876) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013469143770635128 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.07334466214160784) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005095204967346945) - actor/ppo_kl:np.float64(1.206180642536007e-05) - actor/pg_clipfrac_lower:np.float64(1.1824173308620327e-06) - actor/grad_norm:np.float64(0.2522984395424525) - perf/mfu/actor:np.float64(0.19750206620074603) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(104.24499130249023) - actor/lr:np.float64(1e-06) - training/global_step:188 - training/epoch:0 - critic/score/mean:0.6354166865348816 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.625011146068573 - critic/rewards/max:1.004773736000061 - critic/rewards/min:-0.04697274789214134 - critic/advantages/mean:-0.16893988847732544 - critic/advantages/max:2.474761486053467 - critic/advantages/min:-2.4748611450195312 - critic/returns/mean:-0.16893988847732544 - critic/returns/max:2.474761486053467 - critic/returns/min:-2.4748611450195312 - response_length/mean:1109.1068115234375 - response_length/max:8192.0 - response_length/min:110.0 - response_length/clip_ratio:0.014322916977107525 - response_length_non_aborted/mean:1109.1068115234375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:110.0 - response_length_non_aborted/clip_ratio:0.014322916977107525 - response/aborted_ratio:0.0 - prompt_length/mean:229.7291717529297 - prompt_length/max:544.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.2419253885746e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3608638308942318) - timing_s/agent_loop/generate_sequences/max:np.float64(28.274040712974966) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.503178523309543) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.274040712974966) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:205 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.612553663551807 - timing_s/reward:0.0002030935138463974 - timing_s/old_log_prob:10.858701003715396 - timing_s/ref:19.60559037514031 - timing_s/adv:0.07660761661827564 - timing_s/update_actor:21.232069158926606 - timing_s/update_weights:27.405001339502633 - timing_s/step:109.24271133355796 - timing_s/stop_profile:5.9759244322776794e-05 - timing_per_token_ms/adv:7.450464841219308e-05 - timing_per_token_ms/update_actor:0.020649224157847212 - timing_per_token_ms/gen:0.03476492398813775 - timing_per_token_ms/ref:0.019067394109019135 - perf/total_num_tokens:1430341 - perf/time_per_step:109.24271133355796 - perf/throughput:3273.3099136304063 - frontier/active_count:54.0 - frontier/completed_count:10.0 - frontier/blacklisted_count:1014.0 - frontier/mean_score:2.6323355707243827 - frontier/mean_frontier_pct:0.4638751912825225 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:5.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:112.0 - frontier/cluster_1/score:2.7698056376299993 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.072594475099999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.5574480099999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9423519899999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.676550999999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:2.2050556069999994 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:208.0 - frontier/cluster_10/score:4.955441143614529 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.7488794099999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:2.6974499899999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.8319299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:2.9717524750999993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.492606569999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:2.203645699999999 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:2.2319299999999993 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.470816132569999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.640488392999999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:3.0483823108999992 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.827692475099999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.874427699999999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.9802267325699994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.9717524750999993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:2.6268605899999997 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.841509989999999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.9102338129999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:3.612941389999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.024291 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.5883509999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.9717645699999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.5396463929999995 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.841252999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.435571292798999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.312951504899999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.9802267325699994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.548151989999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.9371636690999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.51 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.9176456999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:1.8659000000000001 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - 
frontier/cluster_49/score:3.3623519899999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:112.0 - frontier/cluster_50/score:1.9928369248999998 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.2168036999999994 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.8784919899999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:144.0 - frontier/cluster_53/score:2.708606431240699 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:1.7068504750999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.1654463929999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.9742947523409997 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.5223519899999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:1.8944351989999997 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:3.918660110481462 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.2401 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:188.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.019485622412127753 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.021615746229261698 - 
cluster/prob_snapshot/cluster_4:0.014703179995038767 - cluster/prob_snapshot/cluster_5:0.010956669102365998 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.020699488477383017 - cluster/prob_snapshot/cluster_8:0.018829574698038753 - cluster/prob_snapshot/cluster_9:0.015512597841526535 - cluster/prob_snapshot/cluster_10:0.03486159956429913 - cluster/prob_snapshot/cluster_11:0.019338406100423904 - cluster/prob_snapshot/cluster_12:0.01897659939262465 - cluster/prob_snapshot/cluster_13:0.019922668192990493 - cluster/prob_snapshot/cluster_14:0.020906321312008255 - cluster/prob_snapshot/cluster_15:0.01753552299307473 - cluster/prob_snapshot/cluster_16:0.015502679125546981 - cluster/prob_snapshot/cluster_17:0.015701659581974576 - cluster/prob_snapshot/cluster_18:0.01286070973633032 - cluster/prob_snapshot/cluster_19:0.024417241304718198 - cluster/prob_snapshot/cluster_20:0.018575873740234283 - cluster/prob_snapshot/cluster_21:0.02144541330662915 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.019892857144503333 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.020221640122404443 - cluster/prob_snapshot/cluster_28:0.02096593783492965 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.014471024521432889 - cluster/prob_snapshot/cluster_31:0.020906321312008255 - cluster/prob_snapshot/cluster_32:0.01848000195054724 - cluster/prob_snapshot/cluster_33:0.019990063560129567 - cluster/prob_snapshot/cluster_34:0.020473536641237793 - cluster/prob_snapshot/cluster_35:0.025417094530476338 - cluster/prob_snapshot/cluster_36:0.014240916237003359 - cluster/prob_snapshot/cluster_37:0.018209086432219416 - cluster/prob_snapshot/cluster_38:0.01387139204811015 - cluster/prob_snapshot/cluster_39:0.01786644882298084 - cluster/prob_snapshot/cluster_40:0.027023269983484515 - 
cluster/prob_snapshot/cluster_41:0.017134278999412896 - cluster/prob_snapshot/cluster_42:0.016271647031741856 - cluster/prob_snapshot/cluster_43:0.02096593783492965 - cluster/prob_snapshot/cluster_44:0.017926285819945556 - cluster/prob_snapshot/cluster_45:0.020662988565390325 - cluster/prob_snapshot/cluster_46:0.017657886022749907 - cluster/prob_snapshot/cluster_47:0.02052567937265592 - cluster/prob_snapshot/cluster_48:0.013126633278824325 - cluster/prob_snapshot/cluster_49:0.023654194505094157 - cluster/prob_snapshot/cluster_50:0.014019636367255622 - cluster/prob_snapshot/cluster_51:0.015595245844386561 - cluster/prob_snapshot/cluster_52:0.013215218109190699 - cluster/prob_snapshot/cluster_53:0.019055085116866793 - cluster/prob_snapshot/cluster_54:0.012007717588522837 - cluster/prob_snapshot/cluster_55:0.02226896080427044 - cluster/prob_snapshot/cluster_56:0.020924206268884676 - cluster/prob_snapshot/cluster_57:0.017744782449671877 - cluster/prob_snapshot/cluster_58:0.013327378813317742 - cluster/prob_snapshot/cluster_59:0.027567830116644706 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.013747121544643661 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.008724121297534726
[36m(RewardLoopWorker pid=2826759)[0m WARNING:2026-04-12 17:30:45,743:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  24%|██▎       | 189/800 [6:00:11<18:59:33, 111.90s/it]
[36m(TaskRunner pid=2823680)[0m step:189 - global_seqlen/min:342780 - global_seqlen/max:472478 - global_seqlen/minmax_diff:129698 - global_seqlen/balanced_min:395387 - global_seqlen/balanced_max:395510 - global_seqlen/mean:395462.5 - frontier/skipped_zero_acc_count:28.0 - actor/entropy:np.float64(0.21843789622187615) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011947428807616234 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05192099930718541) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005231111239118036) - actor/ppo_kl:np.float64(6.64250007685041e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.21031414889372313) - perf/mfu/actor:np.float64(0.21740104938226282) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(104.56504821777344) - actor/lr:np.float64(1e-06) - training/global_step:189 - training/epoch:0 - critic/score/mean:0.581250011920929 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5700984597206116 - critic/rewards/max:1.0098731517791748 - critic/rewards/min:-0.09024195373058319 - critic/advantages/mean:-0.10777398943901062 - critic/advantages/max:2.474752187728882 - critic/advantages/min:-2.474853277206421 - critic/returns/mean:-0.10777398943901062 - critic/returns/max:2.474752187728882 - critic/returns/min:-2.474853277206421 - response_length/mean:1206.280029296875 - response_length/max:8192.0 - response_length/min:263.0 - response_length/clip_ratio:0.0062500000931322575 - response_length_non_aborted/mean:1206.280029296875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:263.0 - response_length_non_aborted/clip_ratio:0.0062500000931322575 - response/aborted_ratio:0.0 - prompt_length/mean:240.49000549316406 - prompt_length/max:395.0 - prompt_length/min:184.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.479971438646317e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(2.075673579238355) - timing_s/agent_loop/generate_sequences/max:np.float64(28.778817613609135) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.164308859644734) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.778817613609135) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:213 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.829980568028986 - timing_s/reward:0.0001340862363576889 - timing_s/old_log_prob:10.203581194393337 - timing_s/ref:20.958531497977674 - timing_s/adv:0.08073877729475498 - timing_s/update_actor:21.358829459175467 - timing_s/update_weights:27.122226666659117 - timing_s/step:110.95507974736392 - timing_s/stop_profile:6.941240280866623e-05 - timing_per_token_ms/adv:6.975778570086726e-05 - timing_per_token_ms/update_actor:0.018453891651035986 - timing_per_token_ms/gen:0.03194737184570434 - timing_per_token_ms/ref:0.018108036780187655 - perf/total_num_tokens:1581850 - perf/time_per_step:110.95507974736392 - perf/throughput:3564.1675973775814 - frontier/active_count:54.0 - frontier/completed_count:10.0 - frontier/blacklisted_count:1042.0 - frontier/mean_score:2.648801724983762 - frontier/mean_frontier_pct:0.4786750361021224 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:5.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:2.238863946340999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.072594475099999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.5574480099999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9596463929999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:2.773585699999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:2.4435389248999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:208.0 - frontier/cluster_10/score:4.955441143614529 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.7488794099999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:2.6974499899999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.8319299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:2.9802267325699994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.644824598999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - frontier/cluster_16/score:2.4425519899999992 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:2.2319299999999993 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.470816132569999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.640488392999999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:3.0483823108999992 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.827692475099999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.3120993899999993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.9802267325699994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.0569999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.9717524750999993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:2.6268605899999997 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - 
frontier/cluster_33/score:2.889056992999999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.9102338129999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:3.429058972999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:2.3170037 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.5883509999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.9717645699999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.5396463929999995 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.841252999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.435571292798999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.312951504899999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.9802267325699994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.548151989999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.9371636690999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.6569999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.9176456999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.20613 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - 
frontier/cluster_49/score:3.8536463929999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:112.0 - frontier/cluster_50/score:1.9928369248999998 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.2168036999999994 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.8784919899999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:144.0 - frontier/cluster_53/score:2.796024501868489 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:1.7068504750999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.1654463929999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.9820063266387 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.5223519899999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:1.8944351989999997 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:3.918660110481462 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.2401 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:189.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.015652528107219277 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.021481372935675595 - 
cluster/prob_snapshot/cluster_4:0.014611778351941749 - cluster/prob_snapshot/cluster_5:0.010888557472149644 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.020691721097913816 - cluster/prob_snapshot/cluster_8:0.01939091841555751 - cluster/prob_snapshot/cluster_9:0.017083468499990827 - cluster/prob_snapshot/cluster_10:0.034644884031860404 - cluster/prob_snapshot/cluster_11:0.019218189787146605 - cluster/prob_snapshot/cluster_12:0.018858632234127985 - cluster/prob_snapshot/cluster_13:0.019798819841250904 - cluster/prob_snapshot/cluster_14:0.020835604045380103 - cluster/prob_snapshot/cluster_15:0.018490713311172826 - cluster/prob_snapshot/cluster_16:0.01707656856027475 - cluster/prob_snapshot/cluster_17:0.015604050936387245 - cluster/prob_snapshot/cluster_18:0.012780761724968762 - cluster/prob_snapshot/cluster_19:0.024265452645672966 - cluster/prob_snapshot/cluster_20:0.018460397674349687 - cluster/prob_snapshot/cluster_21:0.021312098879832906 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.019769194111777397 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.01616453771021039 - cluster/prob_snapshot/cluster_28:0.020835604045380103 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.014381066062174246 - cluster/prob_snapshot/cluster_31:0.02077635812583516 - cluster/prob_snapshot/cluster_32:0.01836512186723968 - cluster/prob_snapshot/cluster_33:0.020198210731025512 - cluster/prob_snapshot/cluster_34:0.020346263841092005 - cluster/prob_snapshot/cluster_35:0.02397351658814019 - cluster/prob_snapshot/cluster_36:0.016198825122023416 - cluster/prob_snapshot/cluster_37:0.01809589048278793 - cluster/prob_snapshot/cluster_38:0.013785161176579774 - cluster/prob_snapshot/cluster_39:0.017755382864509256 - cluster/prob_snapshot/cluster_40:0.026855281066857077 - 
cluster/prob_snapshot/cluster_41:0.017027764541019712 - cluster/prob_snapshot/cluster_42:0.01617049508535354 - cluster/prob_snapshot/cluster_43:0.020835604045380103 - cluster/prob_snapshot/cluster_44:0.017814847887530755 - cluster/prob_snapshot/cluster_45:0.02053453804605989 - cluster/prob_snapshot/cluster_46:0.018575834967037907 - cluster/prob_snapshot/cluster_47:0.020398082429615275 - cluster/prob_snapshot/cluster_48:0.01542367587347811 - cluster/prob_snapshot/cluster_49:0.02694192676616067 - cluster/prob_snapshot/cluster_50:0.013932483941724394 - cluster/prob_snapshot/cluster_51:0.01549829871491118 - cluster/prob_snapshot/cluster_52:0.013133066312812428 - cluster/prob_snapshot/cluster_53:0.019547794396124737 - cluster/prob_snapshot/cluster_54:0.01193307216366874 - cluster/prob_snapshot/cluster_55:0.022130526832282053 - cluster/prob_snapshot/cluster_56:0.020848045688484543 - cluster/prob_snapshot/cluster_57:0.017634472824621623 - cluster/prob_snapshot/cluster_58:0.013244529775074 - cluster/prob_snapshot/cluster_59:0.027396455966962052 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.01366166319499013 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.008669888198202375
(RewardLoopWorker pid=2826757) WARNING:2026-04-12 17:32:36,138:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  24%|██▍       | 190/800 [6:01:53<18:25:20, 108.72s/it]
[36m(TaskRunner pid=2823680)[0m step:190 - global_seqlen/min:277539 - global_seqlen/max:380215 - global_seqlen/minmax_diff:102676 - global_seqlen/balanced_min:339210 - global_seqlen/balanced_max:339349 - global_seqlen/mean:339255.0 - frontier/skipped_zero_acc_count:36.0 - actor/entropy:np.float64(0.22796573246950688) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.014230198226869106 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.02336430476862006) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00048659654974313605) - actor/ppo_kl:np.float64(-5.833408252442496e-06) - actor/pg_clipfrac_lower:np.float64(1.3135198890307473e-05) - actor/grad_norm:np.float64(0.27418819939096767) - perf/mfu/actor:np.float64(0.20687678718663233) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(103.68968963623047) - actor/lr:np.float64(1e-06) - training/global_step:190 - training/epoch:0 - critic/score/mean:0.6426630616188049 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6320773959159851 - critic/rewards/max:1.0169318914413452 - critic/rewards/min:-0.05581135302782059 - critic/advantages/mean:-0.06556852161884308 - critic/advantages/max:2.4748151302337646 - critic/advantages/min:-2.474863290786743 - critic/returns/mean:-0.06556852161884308 - critic/returns/max:2.4748151302337646 - critic/returns/min:-2.474863290786743 - response_length/mean:994.1589965820312 - response_length/max:8192.0 - response_length/min:127.0 - response_length/clip_ratio:0.005434782709926367 - response_length_non_aborted/mean:994.1589965820312 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:127.0 - response_length_non_aborted/clip_ratio:0.005434782709926367 - response/aborted_ratio:0.0 - prompt_length/mean:242.5 - prompt_length/max:1168.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.56742262840271e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0502266930416226) - timing_s/agent_loop/generate_sequences/max:np.float64(26.512783082202077) - timing_s/agent_loop/generate_sequences/mean:np.float64(5.842388865934481) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(26.512783082202077) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:225 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:28.409623387269676 - timing_s/reward:0.0001660473644733429 - timing_s/old_log_prob:9.03640749398619 - timing_s/ref:18.90555446408689 - timing_s/adv:0.07215562928467989 - timing_s/update_actor:19.10323907621205 - timing_s/update_weights:25.14292967878282 - timing_s/step:101.06492187827826 - timing_s/stop_profile:4.8796646296978e-05 - timing_per_token_ms/adv:7.927613220302323e-05 - timing_per_token_ms/update_actor:0.020988395798431356 - timing_per_token_ms/gen:0.03882682050081888 - timing_per_token_ms/ref:0.02077120316078548 - perf/total_num_tokens:1357020 - perf/time_per_step:101.06492187827826 - perf/throughput:3356.8026739148513 - frontier/active_count:51.0 - frontier/completed_count:13.0 - frontier/blacklisted_count:1078.0 - frontier/mean_score:2.5941112647861733 - frontier/mean_frontier_pct:0.48243937244236645 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:2.238863946340999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.072594475099999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.5574480099999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9596463929999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:2.241509989999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:2.4435389248999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.7488794099999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:2.6974499899999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.8319299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:2.9802267325699994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.644824598999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - frontier/cluster_16/score:2.6097863929999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.470816132569999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.640488392999999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:3.0483823108999992 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:96.0 - frontier/cluster_23/score:2.279384732569999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.3120993899999993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.9861587127989995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.339899999999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.9717524750999993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:2.6268605899999997 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - 
frontier/cluster_33/score:2.889056992999999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.9102338129999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:3.429058972999999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:2.3170037 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.1118456999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.9717645699999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.6777524750999993 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:4.188877099999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:2.604899904959299 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.312951504899999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.9802267325699994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.548151989999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.9371636690999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.6569999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.9176456999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:2.4442909999999998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - 
frontier/cluster_49/score:3.8536463929999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:112.0 - frontier/cluster_50/score:1.9928369248999998 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.2168036999999994 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.8784919899999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:144.0 - frontier/cluster_53/score:2.796024501868489 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:1.7068504750999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:3.1158124750999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.9874044286470895 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.5223519899999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:1.8944351989999997 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:190.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.016922671614522405 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.023224505174462076 - 
cluster/prob_snapshot/cluster_4:0.015797468949444102 - cluster/prob_snapshot/cluster_5:0.011772122764760052 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.02237077607404389 - cluster/prob_snapshot/cluster_8:0.016942671993729067 - cluster/prob_snapshot/cluster_9:0.018469727412854436 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.020777673265665656 - cluster/prob_snapshot/cluster_12:0.020388938975934592 - cluster/prob_snapshot/cluster_13:0.02140541925454509 - cluster/prob_snapshot/cluster_14:0.022526334578984605 - cluster/prob_snapshot/cluster_15:0.019991164822693034 - cluster/prob_snapshot/cluster_16:0.01972632512349245 - cluster/prob_snapshot/cluster_17:0.01407676176816563 - cluster/prob_snapshot/cluster_18:0.013817872242334337 - cluster/prob_snapshot/cluster_19:0.026234502432298676 - cluster/prob_snapshot/cluster_20:0.019958389186500024 - cluster/prob_snapshot/cluster_21:0.023041495168649475 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01722895193139307 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.017476228861987388 - cluster/prob_snapshot/cluster_28:0.022571172030408693 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017686362485552273 - cluster/prob_snapshot/cluster_31:0.022462281076950193 - cluster/prob_snapshot/cluster_32:0.019855382107676275 - cluster/prob_snapshot/cluster_33:0.02183721920578557 - cluster/prob_snapshot/cluster_34:0.02199728626626307 - cluster/prob_snapshot/cluster_35:0.025918876866880467 - cluster/prob_snapshot/cluster_36:0.017513298567701962 - cluster/prob_snapshot/cluster_37:0.015962591804673226 - cluster/prob_snapshot/cluster_38:0.014903774913966028 - cluster/prob_snapshot/cluster_39:0.020240053387411167 - cluster/prob_snapshot/cluster_40:0.03166203632549639 - cluster/prob_snapshot/cluster_41:0.019689390126796377 - 
cluster/prob_snapshot/cluster_42:0.017482669655611363 - cluster/prob_snapshot/cluster_43:0.022526334578984605 - cluster/prob_snapshot/cluster_44:0.019260455473918272 - cluster/prob_snapshot/cluster_45:0.022200838211503617 - cluster/prob_snapshot/cluster_46:0.020083193779269367 - cluster/prob_snapshot/cluster_47:0.022053309737430193 - cluster/prob_snapshot/cluster_48:0.018475412045887884 - cluster/prob_snapshot/cluster_49:0.029128162313662567 - cluster/prob_snapshot/cluster_50:0.01506305236479111 - cluster/prob_snapshot/cluster_51:0.016755927089838658 - cluster/prob_snapshot/cluster_52:0.014198765016174382 - cluster/prob_snapshot/cluster_53:0.021134024043135108 - cluster/prob_snapshot/cluster_54:0.012901395876428785 - cluster/prob_snapshot/cluster_55:0.023551172644824318 - cluster/prob_snapshot/cluster_56:0.02258058789520775 - cluster/prob_snapshot/cluster_57:0.019065443656264853 - cluster/prob_snapshot/cluster_58:0.014319273317194478 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.014770255537851062 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  24%|██▍       | 191/800 [6:03:54<19:02:12, 112.53s/it]
[36m(TaskRunner pid=2823680)[0m step:191 - global_seqlen/min:372680 - global_seqlen/max:455676 - global_seqlen/minmax_diff:82996 - global_seqlen/balanced_min:415883 - global_seqlen/balanced_max:416017 - global_seqlen/mean:415957.75 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.20152545460344604) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012455170974135399 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04682844285707688) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004391746919433596) - actor/ppo_kl:np.float64(-8.288223785513082e-06) - actor/pg_clipfrac_lower:np.float64(1.0202655668460163e-06) - actor/grad_norm:np.float64(0.2279980629682541) - perf/mfu/actor:np.float64(0.2196916562208581) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(103.83066177368164) - actor/lr:np.float64(1e-06) - training/global_step:191 - training/epoch:0 - critic/score/mean:0.5789473652839661 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5681049227714539 - critic/rewards/max:1.0076329708099365 - critic/rewards/min:-0.09371192753314972 - critic/advantages/mean:-0.21272337436676025 - critic/advantages/max:2.474681854248047 - critic/advantages/min:-2.474843978881836 - critic/returns/mean:-0.21272337436676025 - critic/returns/max:2.474681854248047 - critic/returns/min:-2.474843978881836 - response_length/mean:1297.6026611328125 - response_length/max:8192.0 - response_length/min:191.0 - response_length/clip_ratio:0.018421052023768425 - response_length_non_aborted/mean:1297.6026611328125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:191.0 - response_length_non_aborted/clip_ratio:0.018421052023768425 - response/aborted_ratio:0.0 - prompt_length/mean:238.9263153076172 - prompt_length/max:505.0 - prompt_length/min:187.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.679553866386414e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6539883725345135) - timing_s/agent_loop/generate_sequences/max:np.float64(31.607304389588535) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.395812080735595) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(31.607304389588535) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:210 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:33.63641774374992 - timing_s/reward:0.00011095311492681503 - timing_s/old_log_prob:10.38510129135102 - timing_s/ref:24.049187562428415 - timing_s/adv:0.07419052626937628 - timing_s/update_actor:22.37049841042608 - timing_s/update_weights:30.324666707776487 - timing_s/step:121.22185889165848 - timing_s/stop_profile:5.5217184126377106e-05 - timing_per_token_ms/adv:6.3532231969679e-05 - timing_per_token_ms/update_actor:0.019156727492781987 - timing_per_token_ms/gen:0.03410785653680159 - timing_per_token_ms/ref:0.020594254276495052 - perf/total_num_tokens:1663831 - perf/time_per_step:121.22185889165848 - perf/throughput:3431.3757749892325 - frontier/active_count:51.0 - frontier/completed_count:13.0 - frontier/blacklisted_count:1111.0 - frontier/mean_score:2.5902551408123715 - frontier/mean_frontier_pct:0.5048900753969785 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:2.238863946340999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.072594475099999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.5574480099999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9596463929999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:2.241509989999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:2.0104772474299994 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.7488794099999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:2.6974499899999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.8319299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:2.9861587127989995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.644824598999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - frontier/cluster_16/score:2.726850475099999 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.470816132569999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.7483418750999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:3.033867617629999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:1.8955693127989994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.3120993899999993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.9861587127989995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.339899999999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.9802267325699994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:2.6268605899999997 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - 
frontier/cluster_33/score:2.922339895099999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.9102338129999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.3003412810999992 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:2.3170037 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.1118456999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.9717645699999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.6777524750999993 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:4.188877099999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:128.0 - frontier/cluster_41/score:2.123429933471509 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.312951504899999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.9802267325699994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.548151989999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.9371636690999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.6569999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:3.54235199 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.0110037 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:3.8536463929999996 
- frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:128.0 - frontier/cluster_50/score:2.2949858474299996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.2168036999999994 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.8784919899999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:144.0 - frontier/cluster_53/score:2.796024501868489 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:1.7068504750999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:3.1158124750999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:128.0 - frontier/cluster_56/score:2.9911831000529623 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:2.6656463929999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.2261046393 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:191.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.016947864468571935 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.023259079595248606 - cluster/prob_snapshot/cluster_4:0.015820986709444467 - 
cluster/prob_snapshot/cluster_5:0.011789647974574515 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.022404079544645097 - cluster/prob_snapshot/cluster_8:0.016967894622429182 - cluster/prob_snapshot/cluster_9:0.015219011392933263 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.020808605077261025 - cluster/prob_snapshot/cluster_12:0.02041929207712015 - cluster/prob_snapshot/cluster_13:0.02143728559429525 - cluster/prob_snapshot/cluster_14:0.022604773831332428 - cluster/prob_snapshot/cluster_15:0.020020925755785065 - cluster/prob_snapshot/cluster_16:0.02064184934220068 - cluster/prob_snapshot/cluster_17:0.014097717903981153 - cluster/prob_snapshot/cluster_18:0.01383844296819877 - cluster/prob_snapshot/cluster_19:0.026273557848954743 - cluster/prob_snapshot/cluster_20:0.020804536018644394 - cluster/prob_snapshot/cluster_21:0.022965923089348406 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.014349175552451557 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.017502245799093135 - cluster/prob_snapshot/cluster_28:0.022604773831332428 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01771269224948761 - cluster/prob_snapshot/cluster_31:0.022559869630201474 - cluster/prob_snapshot/cluster_32:0.0198849409004562 - cluster/prob_snapshot/cluster_33:0.022121674947778197 - cluster/prob_snapshot/cluster_34:0.022030033720980332 - cluster/prob_snapshot/cluster_35:0.024983088777470824 - cluster/prob_snapshot/cluster_36:0.017539370690638115 - cluster/prob_snapshot/cluster_37:0.015986355383778683 - cluster/prob_snapshot/cluster_38:0.014925962227810279 - cluster/prob_snapshot/cluster_39:0.02027018484198044 - cluster/prob_snapshot/cluster_40:0.03170917173521353 - cluster/prob_snapshot/cluster_41:0.016074046294683876 - cluster/prob_snapshot/cluster_42:0.017508696181154294 - 
cluster/prob_snapshot/cluster_43:0.022559869630201474 - cluster/prob_snapshot/cluster_44:0.019289128596858596 - cluster/prob_snapshot/cluster_45:0.022233888694877628 - cluster/prob_snapshot/cluster_46:0.020113091716265047 - cluster/prob_snapshot/cluster_47:0.02681507356649003 - cluster/prob_snapshot/cluster_48:0.01522299655997304 - cluster/prob_snapshot/cluster_49:0.029171525534235222 - cluster/prob_snapshot/cluster_50:0.01737269884715463 - cluster/prob_snapshot/cluster_51:0.016780871710596803 - cluster/prob_snapshot/cluster_52:0.014219902778750185 - cluster/prob_snapshot/cluster_53:0.021165486355666247 - cluster/prob_snapshot/cluster_54:0.012920602239983765 - cluster/prob_snapshot/cluster_55:0.023586233376879602 - cluster/prob_snapshot/cluster_56:0.022642807689690347 - cluster/prob_snapshot/cluster_57:0.020178543615182575 - cluster/prob_snapshot/cluster_58:0.016851278426888983 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.014792244080825566 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826756) WARNING:2026-04-12 17:36:22,148:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  24%|██▍       | 192/800 [6:05:48<19:05:11, 113.01s/it]
[36m(TaskRunner pid=2823680)[0m step:192 - global_seqlen/min:343133 - global_seqlen/max:458564 - global_seqlen/minmax_diff:115431 - global_seqlen/balanced_min:405267 - global_seqlen/balanced_max:405323 - global_seqlen/mean:405297.5 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.22124687699682039) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012599645182490349 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.028043207988957874) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006312791486023976) - actor/ppo_kl:np.float64(-2.5663541012106533e-07) - actor/pg_clipfrac_lower:np.float64(3.889958565117053e-06) - actor/grad_norm:np.float64(0.22308894246816635) - perf/mfu/actor:np.float64(0.22816446235474278) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(103.8530683517456) - actor/lr:np.float64(1e-06) - training/global_step:192 - training/epoch:0 - critic/score/mean:0.6648351550102234 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6527599096298218 - critic/rewards/max:1.0282517671585083 - critic/rewards/min:-0.08116666227579117 - critic/advantages/mean:-0.13867197930812836 - critic/advantages/max:2.4745874404907227 - critic/advantages/min:-2.474848508834839 - critic/returns/mean:-0.13867197930812836 - critic/returns/max:2.4745874404907227 - critic/returns/min:-2.474848508834839 - response_length/mean:1269.4354248046875 - response_length/max:8192.0 - response_length/min:165.0 - response_length/clip_ratio:0.009615384973585606 - response_length_non_aborted/mean:1269.4354248046875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:165.0 - response_length_non_aborted/clip_ratio:0.009615384973585606 - response/aborted_ratio:0.0 - prompt_length/mean:236.09890747070312 - prompt_length/max:414.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.969792932271957e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.549936887808144) - timing_s/agent_loop/generate_sequences/max:np.float64(29.23222445882857) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.435575542361221) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.23222445882857) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:220 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.104585307650268 - timing_s/reward:0.00023867562413215637 - timing_s/old_log_prob:10.060126119293272 - timing_s/ref:22.98540930543095 - timing_s/adv:0.07904281374067068 - timing_s/update_actor:20.955313459038734 - timing_s/update_weights:28.36282833479345 - timing_s/step:113.93063763249665 - timing_s/stop_profile:8.532032370567322e-05 - timing_per_token_ms/adv:7.211744738567198e-05 - timing_per_token_ms/update_actor:0.0191193056561813 - timing_per_token_ms/gen:0.033657543651132306 - timing_per_token_ms/ref:0.02097153387860262 - perf/total_num_tokens:1621190 - perf/time_per_step:113.93063763249665 - perf/throughput:3557.4057024709937 - frontier/active_count:51.0 - frontier/completed_count:13.0 - frontier/blacklisted_count:1148.0 - frontier/mean_score:2.604401956980371 - frontier/mean_frontier_pct:0.5216945427865975 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:2.4672047624386995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.450816132569999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.5574480099999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9596463929999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:2.241509989999999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:2.3073340732009995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.8242155869999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:2.6974499899999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.8319299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:2.9903110989592996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.644824598999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.2087953325699994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:112.0 - frontier/cluster_19/score:3.9295712927989994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.7483418750999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:3.033867617629999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:1.8955693127989994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.3120993899999993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.9861587127989995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.5379299999999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.9802267325699994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:2.7388024129999993 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - 
frontier/cluster_33/score:2.922339895099999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.9102338129999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.3003412810999992 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:1.92190259 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3782919899999992 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.9717645699999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.6777524750999993 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:4.432213969999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:128.0 - frontier/cluster_41/score:2.123429933471509 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.312951504899999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.9802267325699994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.548151989999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.356014568369999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.6569999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:3.54235199 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.30770259 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:4.1975524751 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:128.0 - frontier/cluster_50/score:2.2949858474299996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.2168036999999994 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.8784919899999997 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:144.0 - frontier/cluster_53/score:2.796024501868489 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:1.7068504750999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:3.1158124750999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:128.0 - frontier/cluster_56/score:2.9911831000529623 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:2.6656463929999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.2261046393 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:2.2678699999999994 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:192.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.01857492229251572 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.018451536698046046 - cluster/prob_snapshot/cluster_4:0.015735048903271733 - 
cluster/prob_snapshot/cluster_5:0.011725607943374758 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.02228238312451999 - cluster/prob_snapshot/cluster_8:0.01687572694249862 - cluster/prob_snapshot/cluster_9:0.017371298793302816 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.021262760944893433 - cluster/prob_snapshot/cluster_12:0.020308376797502317 - cluster/prob_snapshot/cluster_13:0.021320840689302543 - cluster/prob_snapshot/cluster_14:0.022513249463215704 - cluster/prob_snapshot/cluster_15:0.019912174356861743 - cluster/prob_snapshot/cluster_16:0.01662942706952502 - cluster/prob_snapshot/cluster_17:0.01402114069859187 - cluster/prob_snapshot/cluster_18:0.013763274114866534 - cluster/prob_snapshot/cluster_19:0.02958468730195454 - cluster/prob_snapshot/cluster_20:0.020691528137611496 - cluster/prob_snapshot/cluster_21:0.02284117479878495 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01427123245762366 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.01740717558415059 - cluster/prob_snapshot/cluster_28:0.022481987262594847 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.019107393618698766 - cluster/prob_snapshot/cluster_31:0.022437326975993623 - cluster/prob_snapshot/cluster_32:0.020619708088494552 - cluster/prob_snapshot/cluster_33:0.022001512517406924 - cluster/prob_snapshot/cluster_34:0.021910369075363617 - cluster/prob_snapshot/cluster_35:0.024847383471576542 - cluster/prob_snapshot/cluster_36:0.014469488632045264 - cluster/prob_snapshot/cluster_37:0.01790552189899973 - cluster/prob_snapshot/cluster_38:0.014844886093152418 - cluster/prob_snapshot/cluster_39:0.020160079495959528 - cluster/prob_snapshot/cluster_40:0.03336894907546131 - cluster/prob_snapshot/cluster_41:0.01598673389753361 - cluster/prob_snapshot/cluster_42:0.017413590928467674 - 
cluster/prob_snapshot/cluster_43:0.022437326975993623 - cluster/prob_snapshot/cluster_44:0.01918435223713836 - cluster/prob_snapshot/cluster_45:0.017737801172307454 - cluster/prob_snapshot/cluster_46:0.020003839682293297 - cluster/prob_snapshot/cluster_47:0.026669417126914802 - cluster/prob_snapshot/cluster_48:0.01737407325734777 - cluster/prob_snapshot/cluster_49:0.03160224567930517 - cluster/prob_snapshot/cluster_50:0.017278332316568213 - cluster/prob_snapshot/cluster_51:0.016689719917920436 - cluster/prob_snapshot/cluster_52:0.014142661878973317 - cluster/prob_snapshot/cluster_53:0.021050517833323762 - cluster/prob_snapshot/cluster_54:0.012850418993431142 - cluster/prob_snapshot/cluster_55:0.023458115631159154 - cluster/prob_snapshot/cluster_56:0.02251981452534608 - cluster/prob_snapshot/cluster_57:0.020068936053916218 - cluster/prob_snapshot/cluster_58:0.016759744192911763 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.017074184380987012 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  24%|██▍       | 193/800 [6:07:40<19:00:49, 112.77s/it]
[36m(TaskRunner pid=2823680)[0m step:193 - global_seqlen/min:371298 - global_seqlen/max:432498 - global_seqlen/minmax_diff:61200 - global_seqlen/balanced_min:408665 - global_seqlen/balanced_max:408725 - global_seqlen/mean:408692.5 - frontier/skipped_zero_acc_count:28.0 - actor/entropy:np.float64(0.23544465251266955) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013580589555203915 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.03212845303642098) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00041558160439308266) - actor/ppo_kl:np.float64(5.517815008935223e-06) - actor/pg_clipfrac_lower:np.float64(4.0063501182885375e-06) - actor/grad_norm:np.float64(0.25360973408588994) - perf/mfu/actor:np.float64(0.22529759139248579) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(104.5681381225586) - actor/lr:np.float64(1e-06) - training/global_step:193 - training/epoch:0 - critic/score/mean:0.581250011920929 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5694170594215393 - critic/rewards/max:1.0112345218658447 - critic/rewards/min:-0.09783691167831421 - critic/advantages/mean:-0.1047089621424675 - critic/advantages/max:2.4747345447540283 - critic/advantages/min:-2.474832534790039 - critic/returns/mean:-0.1047089621424675 - critic/returns/max:2.4747345447540283 - critic/returns/min:-2.474832534790039 - response_length/mean:1212.4112548828125 - response_length/max:8192.0 - response_length/min:160.0 - response_length/clip_ratio:0.008750000037252903 - response_length_non_aborted/mean:1212.4112548828125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:160.0 - response_length_non_aborted/clip_ratio:0.008750000037252903 - response/aborted_ratio:0.0 - prompt_length/mean:240.5 - prompt_length/max:399.0 - prompt_length/min:182.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.313637226819992e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2890764689072967) - timing_s/agent_loop/generate_sequences/max:np.float64(28.902845083735883) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.378660819935249) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.902845083735883) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:226 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.782907941378653 - timing_s/reward:0.00012123491615056992 - timing_s/old_log_prob:10.57496094238013 - timing_s/ref:20.792143690399826 - timing_s/adv:0.08224139455705881 - timing_s/update_actor:21.462824477814138 - timing_s/update_weights:27.84106591064483 - timing_s/step:111.98680278658867 - timing_s/stop_profile:5.837436765432358e-05 - timing_per_token_ms/adv:7.075569357476137e-05 - timing_per_token_ms/update_actor:0.01846536090712194 - timing_per_token_ms/gen:0.03173727967859364 - timing_per_token_ms/ref:0.01788834632053388 - perf/total_num_tokens:1634770 - perf/time_per_step:111.98680278658867 - perf/throughput:3649.47020390285 - frontier/active_count:49.0 - frontier/completed_count:15.0 - frontier/blacklisted_count:1176.0 - frontier/mean_score:2.570611845783244 - frontier/mean_frontier_pct:0.5216376447361541 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:2.4672047624386995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.6155712927989994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.5574480099999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9596463929999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:96.0 - frontier/cluster_8/score:1.8690569929999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.9151338512406997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.8242155869999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:2.7882149929999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.8319299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:2.9903110989592996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.151377219299999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.2087953325699994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:1.8280999999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.7483418750999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:2.423707332340999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:1.8955693127989994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.3120993899999993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.9903110989592996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.5379299999999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.9802267325699994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:96.0 - frontier/cluster_32/score:2.2171616890999992 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - 
frontier/cluster_33/score:2.922339895099999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.9102338129999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.3003412810999992 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:1.92190259 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3782919899999992 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.9717645699999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.6777524750999993 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:4.432213969999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:144.0 - frontier/cluster_41/score:2.9864009534300564 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.312951504899999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.9802267325699994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.683706392999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.356014568369999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.6569999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.979646393 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.30770259 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:4.1975524751 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:144.0 - frontier/cluster_50/score:1.9064900932009996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.2168036999999994 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.857217151307942 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:1.7068504750999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:3.0810687325699995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:128.0 - frontier/cluster_56/score:2.9911831000529623 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:2.6656463929999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.2261046393 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:2.2678699999999994 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:193.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.019587211380584075 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.020765097641268222 - cluster/prob_snapshot/cluster_4:0.016592571645717973 - 
cluster/prob_snapshot/cluster_5:0.012364625689189418 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.023496720010451326 - cluster/prob_snapshot/cluster_8:0.014838498596307507 - cluster/prob_snapshot/cluster_9:0.01520430413294311 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.022421530847010016 - cluster/prob_snapshot/cluster_12:0.02213572106938638 - cluster/prob_snapshot/cluster_13:0.02248277579935794 - cluster/prob_snapshot/cluster_14:0.023740168015534847 - cluster/prob_snapshot/cluster_15:0.017079847200096053 - cluster/prob_snapshot/cluster_16:0.017535691294925923 - cluster/prob_snapshot/cluster_17:0.014785259520083496 - cluster/prob_snapshot/cluster_18:0.014513339820831113 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.021819167210298383 - cluster/prob_snapshot/cluster_21:0.019241884000057424 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.015048980685187457 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.018355825253873595 - cluster/prob_snapshot/cluster_28:0.023740168015534847 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02014870112766364 - cluster/prob_snapshot/cluster_31:0.02366010793332617 - cluster/prob_snapshot/cluster_32:0.017602112046187952 - cluster/prob_snapshot/cluster_33:0.02320054262324722 - cluster/prob_snapshot/cluster_34:0.023104432080379666 - cluster/prob_snapshot/cluster_35:0.02620150677606334 - cluster/prob_snapshot/cluster_36:0.015258041349600926 - cluster/prob_snapshot/cluster_37:0.018881330257661325 - cluster/prob_snapshot/cluster_38:0.015653897079527888 - cluster/prob_snapshot/cluster_39:0.02125875588162458 - cluster/prob_snapshot/cluster_40:0.03518747743845794 - cluster/prob_snapshot/cluster_41:0.02370912525484626 - cluster/prob_snapshot/cluster_42:0.0183625902191983 - 
cluster/prob_snapshot/cluster_43:0.02366010793332617 - cluster/prob_snapshot/cluster_44:0.021306024211446815 - cluster/prob_snapshot/cluster_45:0.01870446915025575 - cluster/prob_snapshot/cluster_46:0.021094001369699834 - cluster/prob_snapshot/cluster_47:0.03159453009592134 - cluster/prob_snapshot/cluster_48:0.018320918928987528 - cluster/prob_snapshot/cluster_49:0.03332449290897491 - cluster/prob_snapshot/cluster_50:0.015135681082913457 - cluster/prob_snapshot/cluster_51:0.017599269960163963 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.022683531239450355 - cluster/prob_snapshot/cluster_54:0.013550736266327515 - cluster/prob_snapshot/cluster_55:0.024460695545717316 - cluster/prob_snapshot/cluster_56:0.023747090857937594 - cluster/prob_snapshot/cluster_57:0.021162645338757027 - cluster/prob_snapshot/cluster_58:0.01767311039160307 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.01800468682209302 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826758) WARNING:2026-04-12 17:40:08,627:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  24%|██▍       | 194/800 [6:09:31<18:51:51, 112.07s/it]
[36m(TaskRunner pid=2823680)[0m step:194 - global_seqlen/min:297993 - global_seqlen/max:418933 - global_seqlen/minmax_diff:120940 - global_seqlen/balanced_min:374309 - global_seqlen/balanced_max:374496 - global_seqlen/mean:374375.5 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.20478933897553658) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013289681635797024 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.10008201518576243) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00034585290684065385) - actor/ppo_kl:np.float64(2.8935962187897158e-05) - actor/pg_clipfrac_lower:np.float64(2.608549818562876e-06) - actor/grad_norm:np.float64(0.23257805521671587) - perf/mfu/actor:np.float64(0.20026070076203284) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(104.42528915405273) - actor/lr:np.float64(1e-06) - training/global_step:194 - training/epoch:0 - critic/score/mean:0.6645408272743225 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6539097428321838 - critic/rewards/max:1.0169044733047485 - critic/rewards/min:-0.04711611941456795 - critic/advantages/mean:-0.12174560129642487 - critic/advantages/max:2.4747695922851562 - critic/advantages/min:-2.4748592376708984 - critic/returns/mean:-0.12174560129642487 - critic/returns/max:2.4747695922851562 - critic/returns/min:-2.4748592376708984 - response_length/mean:1122.965576171875 - response_length/max:8192.0 - response_length/min:217.0 - response_length/clip_ratio:0.015306122601032257 - response_length_non_aborted/mean:1122.965576171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:217.0 - response_length_non_aborted/clip_ratio:0.015306122601032257 - response/aborted_ratio:0.0 - prompt_length/mean:236.52040100097656 - prompt_length/max:434.0 - prompt_length/min:178.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010510161519050598 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.488566316664219) - timing_s/agent_loop/generate_sequences/max:np.float64(28.418108159676194) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.509798549659536) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.418108159676194) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:217 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.071726569905877 - timing_s/reward:0.00012393947690725327 - timing_s/old_log_prob:9.794261185452342 - timing_s/ref:20.158008239232004 - timing_s/adv:0.0918356329202652 - timing_s/update_actor:22.024855165742338 - timing_s/update_weights:27.627888644114137 - timing_s/step:110.1822419911623 - timing_s/stop_profile:7.140077650547028e-05 - timing_per_token_ms/adv:8.616292446243205e-05 - timing_per_token_ms/update_actor:0.02066437472685067 - timing_per_token_ms/gen:0.03415669671333747 - timing_per_token_ms/ref:0.018912843370263936 - perf/total_num_tokens:1497502 - perf/time_per_step:110.1822419911623 - perf/throughput:3397.784372821426 - frontier/active_count:47.0 - frontier/completed_count:17.0 - frontier/blacklisted_count:1206.0 - frontier/mean_score:2.57927946707701 - frontier/mean_frontier_pct:0.5274951649374995 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:2.4672047624386995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.6155712927989994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.3629999999999995 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.5574480099999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9596463929999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:96.0 - frontier/cluster_8/score:1.8690569929999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:2.2405936958684896 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.8242155869999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.8517504950999992 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.8319299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:2.9903110989592996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.151377219299999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - 
frontier/cluster_16/score:2.2087953325699994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:2.77967 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.8238393125699996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:2.596595132638699 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:1.8955693127989994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.3120993899999993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.9903110989592996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.5379299999999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.9861587127989995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:96.0 - frontier/cluster_32/score:2.2171616890999992 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.922339895099999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.5371636690999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.210238896769999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:96.0 - frontier/cluster_36/score:1.6453318129999999 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3782919899999992 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.9717645699999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.6777524750999993 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:144.0 - frontier/cluster_41/score:2.9864009534300564 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.312951504899999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.9802267325699994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.7785944750999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.356014568369999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.979646393 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.30770259 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - 
frontier/cluster_49/score:4.1975524751 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:144.0 - frontier/cluster_50/score:1.9064900932009996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.2168036999999994 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.857217151307942 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.0947953325699995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.0567481127989993 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:2.6656463929999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.2261046393 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:2.2678699999999994 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:194.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.020352086317055245 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.021575968695374807 - cluster/prob_snapshot/cluster_4:0.01949249640701293 - 
cluster/prob_snapshot/cluster_5:0.012847460744407296 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.024414260127626437 - cluster/prob_snapshot/cluster_8:0.015417937672685096 - cluster/prob_snapshot/cluster_9:0.018482761136814373 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.02329708513847965 - cluster/prob_snapshot/cluster_12:0.02352422187026411 - cluster/prob_snapshot/cluster_13:0.02336072168849434 - cluster/prob_snapshot/cluster_14:0.02466721470686206 - cluster/prob_snapshot/cluster_15:0.017746810290873764 - cluster/prob_snapshot/cluster_16:0.018220454965699386 - cluster/prob_snapshot/cluster_17:0.0153626196259403 - cluster/prob_snapshot/cluster_18:0.022929626528853844 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.023293981233285056 - cluster/prob_snapshot/cluster_21:0.021419433471615366 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01563663902622872 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.01907261491842225 - cluster/prob_snapshot/cluster_28:0.02466721470686206 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.020935502076280286 - cluster/prob_snapshot/cluster_31:0.024632961481170064 - cluster/prob_snapshot/cluster_32:0.018289469428078062 - cluster/prob_snapshot/cluster_33:0.0241065170991567 - cluster/prob_snapshot/cluster_34:0.029178226877252824 - cluster/prob_snapshot/cluster_35:0.02648140929367007 - cluster/prob_snapshot/cluster_36:0.013572418304378574 - cluster/prob_snapshot/cluster_37:0.019618640740542793 - cluster/prob_snapshot/cluster_38:0.01626517723072382 - cluster/prob_snapshot/cluster_39:0.022088904147167465 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.024634959735351406 - cluster/prob_snapshot/cluster_42:0.01907964405368531 - 
cluster/prob_snapshot/cluster_43:0.024584028301610075 - cluster/prob_snapshot/cluster_44:0.022920754474156885 - cluster/prob_snapshot/cluster_45:0.019434873258071236 - cluster/prob_snapshot/cluster_46:0.022766542883501897 - cluster/prob_snapshot/cluster_47:0.03282828735367519 - cluster/prob_snapshot/cluster_48:0.019036345511650204 - cluster/prob_snapshot/cluster_49:0.03462580471398011 - cluster/prob_snapshot/cluster_50:0.015726725049397476 - cluster/prob_snapshot/cluster_51:0.018286516359417252 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.023569316570429873 - cluster/prob_snapshot/cluster_54:0.01728006368749394 - cluster/prob_snapshot/cluster_55:0.025215256710062656 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.021989040515412388 - cluster/prob_snapshot/cluster_58:0.018363240238336888 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.018707764632489382 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826756) WARNING:2026-04-12 17:41:52,095:WARNING: Error in configuration: macro '\frac' failed its substitution!
(RewardLoopWorker pid=2826762) WARNING:2026-04-12 17:42:01,600:WARNING: Error in configuration: macro '\frac' failed its substitution! [repeated 2x across cluster]
(TaskRunner pid=2823680) Training Progress:  24%|██▍       | 195/800 [6:11:23<18:48:48, 111.95s/it]
[36m(TaskRunner pid=2823680)[0m step:195 - global_seqlen/min:332865 - global_seqlen/max:401571 - global_seqlen/minmax_diff:68706 - global_seqlen/balanced_min:375934 - global_seqlen/balanced_max:376099 - global_seqlen/mean:376005.5 - frontier/skipped_zero_acc_count:35.0 - actor/entropy:np.float64(0.2336784709799797) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012733974494040012 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05624981851360644) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005722888778651163) - actor/ppo_kl:np.float64(1.9631644770748473e-05) - actor/pg_clipfrac_lower:np.float64(7.241802198590681e-06) - actor/grad_norm:np.float64(0.26169127598404884) - perf/mfu/actor:np.float64(0.21074213136831715) - perf/max_memory_allocated_gb:np.float64(79.50338554382324) - perf/max_memory_reserved_gb:np.float64(85.24609375) - perf/cpu_memory_used_gb:np.float64(104.30612564086914) - actor/lr:np.float64(1e-06) - training/global_step:195 - training/epoch:0 - critic/score/mean:0.5833333134651184 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5733377933502197 - critic/rewards/max:1.0277087688446045 - critic/rewards/min:-0.05887327343225479 - critic/advantages/mean:-0.16412797570228577 - critic/advantages/max:2.474802255630493 - critic/advantages/min:-2.474848985671997 - critic/returns/mean:-0.16412797570228577 - critic/returns/max:2.474802255630493 - critic/returns/min:-2.474848985671997 - response_length/mean:1173.2003173828125 - response_length/max:8192.0 - response_length/min:117.0 - response_length/clip_ratio:0.016129031777381897 - response_length_non_aborted/mean:1173.2003173828125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:117.0 - response_length_non_aborted/clip_ratio:0.016129031777381897 - response/aborted_ratio:0.0 - prompt_length/mean:233.68817138671875 - prompt_length/max:495.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.954387158155441e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0407978110015392) - timing_s/agent_loop/generate_sequences/max:np.float64(28.34435474779457) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.55767481869043) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.34435474779457) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:221 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.690859828144312 - timing_s/reward:0.00015677791088819504 - timing_s/old_log_prob:9.620334209874272 - timing_s/ref:21.422627507708967 - timing_s/adv:0.07249798625707626 - timing_s/update_actor:20.9124868363142 - timing_s/update_weights:28.352025857195258 - timing_s/step:111.44963739532977 - timing_s/stop_profile:5.428958684206009e-05 - timing_per_token_ms/adv:6.926173183699278e-05 - timing_per_token_ms/update_actor:0.019978969487032604 - timing_per_token_ms/gen:0.035161222494926814 - timing_per_token_ms/ref:0.020466337870700485 - perf/total_num_tokens:1504022 - perf/time_per_step:111.44963739532977 - perf/throughput:3373.77051004884 - frontier/active_count:47.0 - frontier/completed_count:17.0 - frontier/blacklisted_count:1241.0 - frontier/mean_score:2.562110756243769 - frontier/mean_frontier_pct:0.5380156408833858 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:2.4672047624386995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.6155712927989994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.3629999999999995 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.5574480099999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9596463929999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:96.0 - frontier/cluster_8/score:1.8690569929999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:2.2405936958684896 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.8242155869999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.896225346569999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.8319299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:2.9903110989592996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.151377219299999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - 
frontier/cluster_16/score:2.2087953325699994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:2.8457689999999998 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:96.0 - frontier/cluster_20/score:3.4766875187989994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.7176165928470892 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:1.8955693127989994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:1.9184695729999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.9903110989592996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.0765509999999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.9903110989592996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.8520131823699995 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.922339895099999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.3760145683699996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.210238896769999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:96.0 - frontier/cluster_36/score:2.0517322690999995 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3782919899999992 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.9717645699999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.6777524750999993 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:160.0 - frontier/cluster_41/score:2.390480667401039 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.312951504899999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.9802267325699994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.7785944750999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.549210197858999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.6857524750999997 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.30770259 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - 
frontier/cluster_49/score:3.83828673257 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:144.0 - frontier/cluster_50/score:1.9064900932009996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.4517625899999995 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.857217151307942 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.0947953325699995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.0567481127989993 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:2.7659524750999993 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.2261046393 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:2.2678699999999994 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:195.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0204884657003349 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.021720549317650883 - cluster/prob_snapshot/cluster_4:0.01962311567607241 - 
cluster/prob_snapshot/cluster_5:0.0129335516122297 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.024577860148163128 - cluster/prob_snapshot/cluster_8:0.015521253313081277 - cluster/prob_snapshot/cluster_9:0.018606614167205243 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.023453198966511952 - cluster/prob_snapshot/cluster_12:0.024051191282148124 - cluster/prob_snapshot/cluster_13:0.023517261945213603 - cluster/prob_snapshot/cluster_14:0.024832509776691308 - cluster/prob_snapshot/cluster_15:0.01786573171273419 - cluster/prob_snapshot/cluster_16:0.018342550281756215 - cluster/prob_snapshot/cluster_17:0.015465564579961543 - cluster/prob_snapshot/cluster_18:0.02363218547371177 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.028871536754528124 - cluster/prob_snapshot/cluster_21:0.022567966468325103 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.015741420184963124 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.01593159134659511 - cluster/prob_snapshot/cluster_28:0.024832509776691308 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017244350605274582 - cluster/prob_snapshot/cluster_31:0.024832509776691308 - cluster/prob_snapshot/cluster_32:0.015379716001378543 - cluster/prob_snapshot/cluster_33:0.024268054932860182 - cluster/prob_snapshot/cluster_34:0.02803551603860778 - cluster/prob_snapshot/cluster_35:0.026658861286138293 - cluster/prob_snapshot/cluster_36:0.017038247842945334 - cluster/prob_snapshot/cluster_37:0.019750105303108948 - cluster/prob_snapshot/cluster_38:0.01637417022559931 - cluster/prob_snapshot/cluster_39:0.02223692195123847 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.01985132401939288 - cluster/prob_snapshot/cluster_42:0.01920749679805267 - 
cluster/prob_snapshot/cluster_43:0.024748765939121637 - cluster/prob_snapshot/cluster_44:0.023074346509429963 - cluster/prob_snapshot/cluster_45:0.021169465338641804 - cluster/prob_snapshot/cluster_46:0.022919101546505394 - cluster/prob_snapshot/cluster_47:0.030607679717417482 - cluster/prob_snapshot/cluster_48:0.01916390811237491 - cluster/prob_snapshot/cluster_49:0.03187437348758152 - cluster/prob_snapshot/cluster_50:0.01583210987480715 - cluster/prob_snapshot/cluster_51:0.020360228909791323 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.023727254621994858 - cluster/prob_snapshot/cluster_54:0.01739585743915264 - cluster/prob_snapshot/cluster_55:0.025384224210779016 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.022969363255779134 - cluster/prob_snapshot/cluster_58:0.01848629235887658 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.018833125411889265 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  24%|██▍       | 196/800 [6:13:20<19:02:54, 113.53s/it]
[36m(TaskRunner pid=2823680)[0m step:196 - global_seqlen/min:362458 - global_seqlen/max:452088 - global_seqlen/minmax_diff:89630 - global_seqlen/balanced_min:408930 - global_seqlen/balanced_max:409088 - global_seqlen/mean:409013.0 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.2568432776143356) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012212091125547886 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04048023522773292) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005657847947559833) - actor/ppo_kl:np.float64(-1.0398898105134377e-05) - actor/pg_clipfrac_lower:np.float64(5.64218658392023e-06) - actor/grad_norm:np.float64(0.2382025306041424) - perf/mfu/actor:np.float64(0.20682029407322425) - perf/max_memory_allocated_gb:np.float64(80.07183074951172) - perf/max_memory_reserved_gb:np.float64(85.83203125) - perf/cpu_memory_used_gb:np.float64(104.48833847045898) - actor/lr:np.float64(1e-06) - training/global_step:196 - training/epoch:0 - critic/score/mean:0.5841836929321289 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5735399127006531 - critic/rewards/max:1.036047101020813 - critic/rewards/min:-0.06436662375926971 - critic/advantages/mean:-0.16698838770389557 - critic/advantages/max:2.4748518466949463 - critic/advantages/min:-2.4748330116271973 - critic/returns/mean:-0.16698838770389557 - critic/returns/max:2.4748518466949463 - critic/returns/min:-2.4748330116271973 - response_length/mean:1292.478271484375 - response_length/max:8192.0 - response_length/min:184.0 - response_length/clip_ratio:0.01913265325129032 - response_length_non_aborted/mean:1292.478271484375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:184.0 - response_length_non_aborted/clip_ratio:0.01913265325129032 - response/aborted_ratio:0.0 - prompt_length/mean:238.80612182617188 - prompt_length/max:404.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.363742381334305e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.409366904757917) - timing_s/agent_loop/generate_sequences/max:np.float64(30.29376073088497) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.386396357949707) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.29376073088497) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:186 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.11984829045832 - timing_s/reward:0.00011950545012950897 - timing_s/old_log_prob:10.77682124543935 - timing_s/ref:22.03609419055283 - timing_s/adv:0.07261850964277983 - timing_s/update_actor:23.312224958091974 - timing_s/update_weights:28.32915015053004 - timing_s/step:117.03312085103244 - timing_s/stop_profile:5.881860852241516e-05 - timing_per_token_ms/adv:6.048886001129489e-05 - timing_per_token_ms/update_actor:0.019418326250131797 - timing_per_token_ms/gen:0.03169816756731039 - timing_per_token_ms/ref:0.018355350767248743 - perf/total_num_tokens:1636052 - perf/time_per_step:117.03312085103244 - perf/throughput:3494.848270521804 - frontier/active_count:46.0 - frontier/completed_count:18.0 - frontier/blacklisted_count:1269.0 - frontier/mean_score:2.5376505873689394 - frontier/mean_frontier_pct:0.5533256894003035 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:2.4672047624386995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.6155712927989994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.3629999999999995 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.5574480099999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:80.0 - frontier/cluster_7/score:2.371752475099999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:96.0 - frontier/cluster_8/score:1.8690569929999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:2.2405936958684896 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:96.0 - frontier/cluster_11/score:2.2769509108999992 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.896225346569999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.8319299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:2.9903110989592996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.151377219299999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - 
frontier/cluster_16/score:1.8461567327989996 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.4920383 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:96.0 - frontier/cluster_20/score:3.4766875187989994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.802331614992962 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:1.8955693127989994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:1.9184695729999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.9903110989592996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.0765509999999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:3.5932177692715097 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.8520131823699995 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.922339895099999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.2632101978589994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.210238896769999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:96.0 - frontier/cluster_36/score:2.0517322690999995 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3782919899999992 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:1.9717645699999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.6777524750999993 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:160.0 - frontier/cluster_41/score:2.5733364671807273 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.519066053429999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.3861587127989994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.2450161325699995 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.549210197858999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:3.4800267325699994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.30770259 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:112.0 - 
frontier/cluster_49/score:3.5868007127989996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:144.0 - frontier/cluster_50/score:1.9064900932009996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.4517625899999995 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.857217151307942 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.0947953325699995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.0397236789592994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:2.8361667325699993 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.2261046393 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:196.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.02113564665164596 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.022406648802892788 - cluster/prob_snapshot/cluster_4:0.020242962318406395 - 
cluster/prob_snapshot/cluster_5:0.01334209114655397 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.020317941592059414 - cluster/prob_snapshot/cluster_8:0.01601153206951035 - cluster/prob_snapshot/cluster_9:0.01919435199167362 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.019505811041138302 - cluster/prob_snapshot/cluster_12:0.02481091009573847 - cluster/prob_snapshot/cluster_13:0.024260115225715032 - cluster/prob_snapshot/cluster_14:0.025616908546993455 - cluster/prob_snapshot/cluster_15:0.018430066856947872 - cluster/prob_snapshot/cluster_16:0.01581535386200695 - cluster/prob_snapshot/cluster_17:0.015954084264344675 - cluster/prob_snapshot/cluster_18:0.029915023157567475 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.02978351859328057 - cluster/prob_snapshot/cluster_21:0.024006556617004435 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.016238653479017207 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.01643483168652061 - cluster/prob_snapshot/cluster_28:0.025616908546993455 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017789057827020362 - cluster/prob_snapshot/cluster_31:0.030781790903593512 - cluster/prob_snapshot/cluster_32:0.015865523937328742 - cluster/prob_snapshot/cluster_33:0.025034623943328393 - cluster/prob_snapshot/cluster_34:0.027954735959500304 - cluster/prob_snapshot/cluster_35:0.027500950072110716 - cluster/prob_snapshot/cluster_36:0.017576444778184406 - cluster/prob_snapshot/cluster_37:0.020373963239838238 - cluster/prob_snapshot/cluster_38:0.016891390559153105 - cluster/prob_snapshot/cluster_39:0.022939332395881827 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.022044838399373892 - cluster/prob_snapshot/cluster_42:0.021579923485890898 - 
cluster/prob_snapshot/cluster_43:0.020441354595398756 - cluster/prob_snapshot/cluster_44:0.01923223739984298 - cluster/prob_snapshot/cluster_45:0.021838157417248007 - cluster/prob_snapshot/cluster_46:0.02364306039042311 - cluster/prob_snapshot/cluster_47:0.02981212442423252 - cluster/prob_snapshot/cluster_48:0.019769249501252157 - cluster/prob_snapshot/cluster_49:0.030726818312663294 - cluster/prob_snapshot/cluster_50:0.016332207836260236 - cluster/prob_snapshot/cluster_51:0.021003359171836 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.024476741061968323 - cluster/prob_snapshot/cluster_54:0.017945350394408845 - cluster/prob_snapshot/cluster_55:0.026040208164003712 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.02429640977407207 - cluster/prob_snapshot/cluster_58:0.019070229509174596 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  25%|██▍       | 197/800 [6:15:21<19:24:00, 115.82s/it]
[36m(TaskRunner pid=2823680)[0m step:197 - global_seqlen/min:352293 - global_seqlen/max:477586 - global_seqlen/minmax_diff:125293 - global_seqlen/balanced_min:405457 - global_seqlen/balanced_max:405543 - global_seqlen/mean:405502.75 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.2677510059641107) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01276414468884468 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.02394550817552954) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00033837149958739957) - actor/ppo_kl:np.float64(0.00010265821669750237) - actor/pg_clipfrac_lower:np.float64(3.107851250166752e-06) - actor/grad_norm:np.float64(0.25672897696495056) - perf/mfu/actor:np.float64(0.20427149338259526) - perf/max_memory_allocated_gb:np.float64(80.07183074951172) - perf/max_memory_reserved_gb:np.float64(85.83203125) - perf/cpu_memory_used_gb:np.float64(103.84225463867188) - actor/lr:np.float64(1e-06) - training/global_step:197 - training/epoch:0 - critic/score/mean:0.5590659379959106 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5488041639328003 - critic/rewards/max:1.012637972831726 - critic/rewards/min:-0.06952658295631409 - critic/advantages/mean:-0.16076958179473877 - critic/advantages/max:2.474792957305908 - critic/advantages/min:-2.4748497009277344 - critic/returns/mean:-0.16076958179473877 - critic/returns/max:2.474792957305908 - critic/returns/min:-2.4748497009277344 - response_length/mean:1318.69091796875 - response_length/max:8192.0 - response_length/min:98.0 - response_length/clip_ratio:0.024725275114178658 - response_length_non_aborted/mean:1318.69091796875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:98.0 - response_length_non_aborted/clip_ratio:0.024725275114178658 - response/aborted_ratio:0.0 - prompt_length/mean:245.85714721679688 - prompt_length/max:962.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.544977754354477e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9206224447116256) - timing_s/agent_loop/generate_sequences/max:np.float64(29.92080548685044) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.063472075418758) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.92080548685044) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:193 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.917631975375116 - timing_s/reward:0.00022970233112573624 - timing_s/old_log_prob:11.901645267382264 - timing_s/ref:22.904672037810087 - timing_s/adv:0.08431089296936989 - timing_s/update_actor:23.416960113681853 - timing_s/update_weights:30.330421661026776 - timing_s/step:120.93394507747144 - timing_s/stop_profile:6.492342799901962e-05 - timing_per_token_ms/adv:7.40224400099473e-05 - timing_per_token_ms/update_actor:0.020559389945734297 - timing_per_token_ms/gen:0.03324729087951975 - timing_per_token_ms/ref:0.020109616351498903 - perf/total_num_tokens:1622011 - perf/time_per_step:120.93394507747144 - perf/throughput:3353.0928784323632 - frontier/active_count:45.0 - frontier/completed_count:19.0 - frontier/blacklisted_count:1304.0 - frontier/mean_score:2.5153829725971164 - frontier/mean_frontier_pct:0.5670058279489 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:2.4672047624386995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:112.0 - frontier/cluster_3/score:2.7308999049592995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.3629999999999995 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.5574480099999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:80.0 - frontier/cluster_7/score:2.371752475099999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:1.6083398950999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:2.4684155871079425 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:96.0 - frontier/cluster_11/score:2.2769509108999992 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.896225346569999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.2823509999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:2.9903110989592996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.405964053509999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - 
frontier/cluster_16/score:2.1923097129592994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.4920383 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:96.0 - frontier/cluster_20/score:3.4766875187989994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.802331614992962 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.2268985189592994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:1.9184695729999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.9932177692715096 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.0765509999999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:3.5932177692715097 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.8520131823699995 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:2.945637926569999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.2632101978589994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.210238896769999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:96.0 - frontier/cluster_36/score:2.0517322690999995 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3782919899999992 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:1.6802351989999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.6777524750999993 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:2.101335527026509 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:112.0 - frontier/cluster_42/score:2.663346237400999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.3861587127989994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.2450161325699995 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.549210197858999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:3.3360187127989995 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:80.0 - frontier/cluster_48/score:1.9153918129999998 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:3.5868007127989996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:144.0 - frontier/cluster_50/score:1.9064900932009996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:80.0 - frontier/cluster_51/score:2.6162338129999996 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.857217151307942 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.0947953325699995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:2.8361667325699993 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.2261046393 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:197.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.021796590457964928 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02412621267448293 - 
cluster/prob_snapshot/cluster_4:0.02087599052835033 - cluster/prob_snapshot/cluster_5:0.013759318622580648 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.020953314517892105 - cluster/prob_snapshot/cluster_8:0.014208924425084875 - cluster/prob_snapshot/cluster_9:0.021807287522851347 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.020115787409846403 - cluster/prob_snapshot/cluster_12:0.025586784977978613 - cluster/prob_snapshot/cluster_13:0.02016349465017812 - cluster/prob_snapshot/cluster_14:0.026417988226278128 - cluster/prob_snapshot/cluster_15:0.02125555767779353 - cluster/prob_snapshot/cluster_16:0.019368022345725253 - cluster/prob_snapshot/cluster_17:0.016452992736548353 - cluster/prob_snapshot/cluster_18:0.03085051141575819 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.030714894503800987 - cluster/prob_snapshot/cluster_21:0.024757278142991862 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01967359813347089 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.01694877386156424 - cluster/prob_snapshot/cluster_28:0.026443667287601332 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.018345348712499536 - cluster/prob_snapshot/cluster_31:0.03174438430707243 - cluster/prob_snapshot/cluster_32:0.016361662993455806 - cluster/prob_snapshot/cluster_33:0.02602332181761524 - cluster/prob_snapshot/cluster_34:0.028828923056504734 - cluster/prob_snapshot/cluster_35:0.028360946594461416 - cluster/prob_snapshot/cluster_36:0.018126086930360698 - cluster/prob_snapshot/cluster_37:0.02101108804777463 - cluster/prob_snapshot/cluster_38:0.014844085526756168 - cluster/prob_snapshot/cluster_39:0.023656680197822366 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0185643083195478 - 
cluster/prob_snapshot/cluster_42:0.023529407882226305 - cluster/prob_snapshot/cluster_43:0.021080586833488166 - cluster/prob_snapshot/cluster_44:0.01983365870483496 - cluster/prob_snapshot/cluster_45:0.022521069803334126 - cluster/prob_snapshot/cluster_46:0.02438241483673046 - cluster/prob_snapshot/cluster_47:0.02947215194701286 - cluster/prob_snapshot/cluster_48:0.016921583303541165 - cluster/prob_snapshot/cluster_49:0.03168769263964119 - cluster/prob_snapshot/cluster_50:0.016842940807472624 - cluster/prob_snapshot/cluster_51:0.023113191832474764 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.025242165970437883 - cluster/prob_snapshot/cluster_54:0.01850652878610402 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.025056195448985873 - cluster/prob_snapshot/cluster_58:0.019666584581101793 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826757) WARNING:2026-04-12 17:47:43,544:WARNING: Error in configuration: macro '\frac' failed its substitution!
(RewardLoopWorker pid=2826757) WARNING:2026-04-12 17:47:46,964:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  25%|██▍       | 198/800 [6:17:02<18:36:23, 111.27s/it]
(TaskRunner pid=2823680) step:198 - global_seqlen/min:316055 - global_seqlen/max:377805 - global_seqlen/minmax_diff:61750 - global_seqlen/balanced_min:351439 - global_seqlen/balanced_max:351527 - global_seqlen/mean:351480.5 - frontier/skipped_zero_acc_count:26.0 - actor/entropy:np.float64(0.2326472166779579) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.014309044927358627 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0971068041108083) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005428688477610087) - actor/ppo_kl:np.float64(-0.00021655129429585512) - actor/pg_clipfrac_lower:np.float64(2.914861953981659e-05) - actor/grad_norm:np.float64(0.24688159960966843) - perf/mfu/actor:np.float64(0.18162326532760126) - perf/max_memory_allocated_gb:np.float64(80.07183074951172) - perf/max_memory_reserved_gb:np.float64(85.83203125) - perf/cpu_memory_used_gb:np.float64(103.85849285125732) - actor/lr:np.float64(1e-06) - training/global_step:198 - training/epoch:0 - critic/score/mean:0.654411792755127 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6442372798919678 - critic/rewards/max:1.0083272457122803 - critic/rewards/min:-0.13256125152111053 - critic/advantages/mean:-0.1618153303861618 - critic/advantages/max:2.4748382568359375 - critic/advantages/min:-2.474853992462158 - critic/returns/mean:-0.1618153303861618 - critic/returns/max:2.4748382568359375 - critic/returns/min:-2.474853992462158 - response_length/mean:1056.3333740234375 - response_length/max:8192.0 - response_length/min:111.0 - response_length/clip_ratio:0.011029412038624287 - response_length_non_aborted/mean:1056.3333740234375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:111.0 - response_length_non_aborted/clip_ratio:0.011029412038624287 - response/aborted_ratio:0.0 - prompt_length/mean:240.9705810546875 - prompt_length/max:962.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.056964725255966e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9411804657429457) - timing_s/agent_loop/generate_sequences/max:np.float64(27.797374293208122) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.078128052049578) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.797374293208122) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:190 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.646640135906637 - timing_s/reward:0.00013721268624067307 - timing_s/old_log_prob:9.874522815458477 - timing_s/ref:13.466041510924697 - timing_s/adv:0.08338631689548492 - timing_s/update_actor:22.824306496419013 - timing_s/update_weights:22.960690919309855 - timing_s/step:100.24525874760002 - timing_s/stop_profile:5.0634145736694336e-05 - timing_per_token_ms/adv:7.877037303559884e-05 - timing_per_token_ms/update_actor:0.02156084120198282 - timing_per_token_ms/gen:0.03555426667336448 - timing_per_token_ms/ref:0.012720613556513034 - perf/total_num_tokens:1405922 - perf/time_per_step:100.24525874760002 - perf/throughput:3506.2057237536415 - frontier/active_count:42.0 - frontier/completed_count:22.0 - frontier/blacklisted_count:1330.0 - frontier/mean_score:2.5417515693849335 - frontier/mean_frontier_pct:0.5579001402050429 - frontier/batch_easy_count:4.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:2.4672047624386995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:2.2116299334715093 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.3629999999999995 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.5574480099999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:80.0 - frontier/cluster_7/score:2.371752475099999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:1.6083398950999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:2.4684155871079425 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:96.0 - frontier/cluster_11/score:2.2769509108999992 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.896225346569999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.2823509999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:2.9903110989592996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.405964053509999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:144.0 - 
frontier/cluster_16/score:2.4346167990715095 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.34442681 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:96.0 - frontier/cluster_20/score:3.333681263159299 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:144.0 - frontier/cluster_21/score:2.861632130495073 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.2268985189592994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:2.2429287010999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.9932177692715096 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.0765509999999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.8520131823699995 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - 
frontier/cluster_33/score:2.945637926569999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.2632101978589994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:3.7471672277389994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:2.5648043929999993 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:1.6802351989999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.7744267325699994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:2.370934868918556 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:112.0 - frontier/cluster_42/score:2.663346237400999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.3861587127989994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.2450161325699995 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.549210197858999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.8319299999999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:3.2352130989592993 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:80.0 - frontier/cluster_48/score:1.9153918129999998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:128.0 - frontier/cluster_49/score:4.0107604989593 
- frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:144.0 - frontier/cluster_50/score:1.9064900932009996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:80.0 - frontier/cluster_51/score:2.6162338129999996 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.857217151307942 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:2.8361667325699993 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.2261046393 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:198.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.023111216391800905 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.020717152767053423 - cluster/prob_snapshot/cluster_4:0.02213509197341395 - cluster/prob_snapshot/cluster_5:0.014589189566297305 - 
cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.022217079633732827 - cluster/prob_snapshot/cluster_8:0.0150659126121665 - cluster/prob_snapshot/cluster_9:0.02312255863277313 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.021329038439153687 - cluster/prob_snapshot/cluster_12:0.027130010335192398 - cluster/prob_snapshot/cluster_13:0.02137962306416136 - cluster/prob_snapshot/cluster_14:0.02801134625670108 - cluster/prob_snapshot/cluster_15:0.02253755209867612 - cluster/prob_snapshot/cluster_16:0.02280595292740861 - cluster/prob_snapshot/cluster_17:0.0174453282572067 - cluster/prob_snapshot/cluster_18:0.03132847864481652 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.03122782114688019 - cluster/prob_snapshot/cluster_21:0.026805962929574677 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.020860179235134748 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.021010339859779254 - cluster/prob_snapshot/cluster_28:0.02803857410887931 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.019451818608753578 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.017348490109071097 - cluster/prob_snapshot/cluster_33:0.027592876184935805 - cluster/prob_snapshot/cluster_34:0.030567692703424163 - cluster/prob_snapshot/cluster_35:0.0351010965830577 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.024025468105319987 - cluster/prob_snapshot/cluster_38:0.01573938242354316 - cluster/prob_snapshot/cluster_39:0.025989077824348412 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02220942081527143 - cluster/prob_snapshot/cluster_42:0.024948545883164242 - cluster/prob_snapshot/cluster_43:0.022352028172225518 - cluster/prob_snapshot/cluster_44:0.021029893599760912 - 
cluster/prob_snapshot/cluster_45:0.023879391531601235 - cluster/prob_snapshot/cluster_46:0.026527732125378824 - cluster/prob_snapshot/cluster_47:0.030305433558636338 - cluster/prob_snapshot/cluster_48:0.017942181102784213 - cluster/prob_snapshot/cluster_49:0.03757027191189155 - cluster/prob_snapshot/cluster_50:0.017858795412359994 - cluster/prob_snapshot/cluster_51:0.024507226438726396 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.026764606192220528 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.026567419160970746 - cluster/prob_snapshot/cluster_58:0.020852742671751582 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826759) WARNING:2026-04-12 17:49:24,509:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  25%|██▍       | 199/800 [6:18:49<18:23:48, 110.20s/it]
(TaskRunner pid=2823680) step:199 - global_seqlen/min:338184 - global_seqlen/max:434722 - global_seqlen/minmax_diff:96538 - global_seqlen/balanced_min:377884 - global_seqlen/balanced_max:378131 - global_seqlen/mean:377995.0 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.20565307318853834) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011793941259384155 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.011543635366251692) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006474129161233577) - actor/ppo_kl:np.float64(-9.871223580641224e-05) - actor/pg_clipfrac_lower:np.float64(1.9550852509079657e-05) - actor/grad_norm:np.float64(0.2595456304649512) - perf/mfu/actor:np.float64(0.21900393243849775) - perf/max_memory_allocated_gb:np.float64(80.07183074951172) - perf/max_memory_reserved_gb:np.float64(85.83203125) - perf/cpu_memory_used_gb:np.float64(104.04936599731445) - actor/lr:np.float64(1e-06) - training/global_step:199 - training/epoch:0 - critic/score/mean:0.6471354365348816 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6366678476333618 - critic/rewards/max:1.0261329412460327 - critic/rewards/min:-0.11339906603097916 - critic/advantages/mean:-0.13754993677139282 - critic/advantages/max:2.474804639816284 - critic/advantages/min:-2.4748294353485107 - critic/returns/mean:-0.13754993677139282 - critic/returns/max:2.474804639816284 - critic/returns/min:-2.4748294353485107 - response_length/mean:1151.1536865234375 - response_length/max:8192.0 - response_length/min:204.0 - response_length/clip_ratio:0.010416666977107525 - response_length_non_aborted/mean:1151.1536865234375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:204.0 - response_length_non_aborted/clip_ratio:0.010416666977107525 - response/aborted_ratio:0.0 - prompt_length/mean:228.4791717529297 - prompt_length/max:355.0 - prompt_length/min:183.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.365100413560867e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6253753658384085) - timing_s/agent_loop/generate_sequences/max:np.float64(27.859237898141146) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.8548233202855045) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.859237898141146) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:270 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.598360530100763 - timing_s/reward:0.00014192424714565277 - timing_s/old_log_prob:9.666933081112802 - timing_s/ref:20.270089875906706 - timing_s/adv:0.07945810537785292 - timing_s/update_actor:20.216135527007282 - timing_s/update_weights:27.25656186323613 - timing_s/step:107.48839029110968 - timing_s/stop_profile:5.819927901029587e-05 - timing_per_token_ms/adv:7.499174691508432e-05 - timing_per_token_ms/update_actor:0.019079781877922003 - timing_per_token_ms/gen:0.03347905128019306 - timing_per_token_ms/ref:0.019130703440403175 - perf/total_num_tokens:1511980 - perf/time_per_step:107.48839029110968 - perf/throughput:3516.612342749576 - frontier/active_count:41.0 - frontier/completed_count:23.0 - frontier/blacklisted_count:1361.0 - frontier/mean_score:2.5639330065844326 - frontier/mean_frontier_pct:0.563174357039579 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:2.448140953430056 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.3629999999999995 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.5574480099999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:80.0 - frontier/cluster_7/score:2.371752475099999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:1.6083398950999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:2.0278909109755596 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:1.8938656376299994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.896225346569999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.2823509999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:2.9932177692715096 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.405964053509999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:144.0 - frontier/cluster_16/score:2.4346167990715095 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:3.2410987669999995 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:3.833576884211509 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:144.0 - frontier/cluster_21/score:2.861632130495073 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.2268985189592994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:2.2429287010999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.9932177692715096 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.0765509999999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:2.1964092276589993 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - 
frontier/cluster_33/score:2.945637926569999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.2632101978589994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:3.5230170594172994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:2.6953630750999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.0761646392999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:3.4420987127989995 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:2.370934868918556 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:112.0 - frontier/cluster_42/score:2.764342366180699 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.3861587127989994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.2450161325699995 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.549210197858999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.8319299999999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:3.2352130989592993 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:80.0 - frontier/cluster_48/score:1.9153918129999998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:128.0 - frontier/cluster_49/score:4.0107604989593 
- frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:144.0 - frontier/cluster_50/score:1.9064900932009996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:80.0 - frontier/cluster_51/score:2.7313636690999994 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.9000520059155592 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:112.0 - frontier/cluster_57/score:2.2853167127989993 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.2261046393 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:199.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.023288734459272372 - cluster/prob_snapshot/cluster_4:0.02247880353872478 - cluster/prob_snapshot/cluster_5:0.014815729089533589 - cluster/prob_snapshot/cluster_6:0.0 - 
cluster/prob_snapshot/cluster_7:0.022562064295496034 - cluster/prob_snapshot/cluster_8:0.015299854644708472 - cluster/prob_snapshot/cluster_9:0.01929096969351893 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.018016010832427636 - cluster/prob_snapshot/cluster_12:0.02755128250927724 - cluster/prob_snapshot/cluster_13:0.021711603781384696 - cluster/prob_snapshot/cluster_14:0.028473954373285787 - cluster/prob_snapshot/cluster_15:0.022887513025850695 - cluster/prob_snapshot/cluster_16:0.023160081556668412 - cluster/prob_snapshot/cluster_17:0.017716217625538565 - cluster/prob_snapshot/cluster_18:0.030832002722385156 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.03646814288226289 - cluster/prob_snapshot/cluster_21:0.027222203326915443 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.021184094078867236 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.021336586383154485 - cluster/prob_snapshot/cluster_28:0.028473954373285787 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.019753864565020093 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.02089405481133681 - cluster/prob_snapshot/cluster_33:0.028021335695126378 - cluster/prob_snapshot/cluster_34:0.031042344876528006 - cluster/prob_snapshot/cluster_35:0.03351384187144197 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.02564051503626999 - cluster/prob_snapshot/cluster_38:0.019750189183610704 - cluster/prob_snapshot/cluster_39:0.03274407986707836 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.022554286551643002 - cluster/prob_snapshot/cluster_42:0.026296702904422153 - cluster/prob_snapshot/cluster_43:0.02269910830183035 - cluster/prob_snapshot/cluster_44:0.02135644375171762 - cluster/prob_snapshot/cluster_45:0.024250188411589577 
- cluster/prob_snapshot/cluster_46:0.026939652181726986 - cluster/prob_snapshot/cluster_47:0.030776013397128682 - cluster/prob_snapshot/cluster_48:0.01822078555400291 - cluster/prob_snapshot/cluster_49:0.038153659457039316 - cluster/prob_snapshot/cluster_50:0.018136105058650178 - cluster/prob_snapshot/cluster_51:0.025982982357388648 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.027587684846830548 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.021739816085811697 - cluster/prob_snapshot/cluster_58:0.02117654204124778 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m local_global_step_folder: /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/checkpoints/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633/global_step_200
[36m(WorkerDict pid=2825160)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.
[36m(WorkerDict pid=2825160)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")
[36m(TaskRunner pid=2823680)[0m test_gen_batch meta info: {'eos_token_id': 151643, 'pad_token_id': 151643, 'recompute_log_prob': False, 'do_sample': True, 'validate': True, 'global_steps': 200}
[36m(TaskRunner pid=2823680)[0m validation generation end
[36m(TaskRunner pid=2823680)[0m Training Progress:  25%|██▌       | 200/800 [6:23:42<27:30:20, 165.03s/it]
[36m(WorkerDict pid=2825159)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.[32m [repeated 3x across cluster][0m
[36m(WorkerDict pid=2825159)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")[32m [repeated 3x across cluster][0m
[36m(TaskRunner pid=2823680)[0m step:200 - global_seqlen/min:317730 - global_seqlen/max:391338 - global_seqlen/minmax_diff:73608 - global_seqlen/balanced_min:356979 - global_seqlen/balanced_max:357180 - global_seqlen/mean:357046.75 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.21923957624369197) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012995453551411629 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.005073882661235984) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00038058877418936594) - actor/ppo_kl:np.float64(5.394556418549554e-05) - actor/pg_clipfrac_lower:np.float64(1.4778548979342708e-06) - actor/grad_norm:np.float64(0.225060040752093) - perf/mfu/actor:np.float64(0.2141122583787593) - perf/max_memory_allocated_gb:np.float64(80.07183074951172) - perf/max_memory_reserved_gb:np.float64(85.83203125) - perf/cpu_memory_used_gb:np.float64(103.8946647644043) - actor/lr:np.float64(1e-06) - val-aux/aime2024/reward/mean@16:np.float64(0.10833333333333334) - val-aux/aime2024/reward/std@16:np.float64(0.12925127327897407) - val-aux/aime2024/reward/best@2/mean:np.float64(0.15830000000000002) - val-aux/aime2024/reward/best@2/std:np.float64(0.12809619489840465) - val-aux/aime2024/reward/worst@2/mean:np.float64(0.056333333333333326) - val-aux/aime2024/reward/worst@2/std:np.float64(0.09054029204347078) - val-aux/aime2024/reward/maj@2/mean:np.float64(0.10859999999999999) - val-aux/aime2024/reward/maj@2/std:np.float64(0.1285771742257817) - val-aux/aime2024/reward/best@4/mean:np.float64(0.21036666666666667) - val-aux/aime2024/reward/best@4/std:np.float64(0.11193448395031844) - val-aux/aime2024/reward/worst@4/mean:np.float64(0.021033333333333334) - val-aux/aime2024/reward/worst@4/std:np.float64(0.04855943896930762) - val-aux/aime2024/reward/maj@4/mean:np.float64(0.1343333333333333) - val-aux/aime2024/reward/maj@4/std:np.float64(0.11641075670594618) - 
val-aux/aime2024/reward/best@8/mean:np.float64(0.25723333333333337) - val-aux/aime2024/reward/best@8/std:np.float64(0.08895443867692503) - val-aux/aime2024/reward/worst@8/mean:np.float64(0.004333333333333333) - val-aux/aime2024/reward/worst@8/std:np.float64(0.019031689813162965) - val-aux/aime2024/reward/maj@8/mean:np.float64(0.1599) - val-aux/aime2024/reward/maj@8/std:np.float64(0.09340466569883403) - val-aux/aime2024/reward/best@16/mean:np.float64(0.29733333333333334) - val-aux/aime2024/reward/best@16/std:np.float64(0.06514431237801423) - val-aux/aime2024/reward/worst@16/mean:np.float64(0.00023333333333333333) - val-aux/aime2024/reward/worst@16/std:np.float64(0.0036277966521263178) - val-aux/aime2024/reward/maj@16/mean:np.float64(0.17626666666666665) - val-aux/aime2024/reward/maj@16/std:np.float64(0.06173809776128532) - val-aux/aime2024/score/mean@16:np.float64(0.10833333333333334) - val-aux/aime2024/score/std@16:np.float64(0.12925127327897407) - val-aux/aime2024/score/best@2/mean:np.float64(0.15830000000000002) - val-aux/aime2024/score/best@2/std:np.float64(0.12809619489840465) - val-aux/aime2024/score/worst@2/mean:np.float64(0.056333333333333326) - val-aux/aime2024/score/worst@2/std:np.float64(0.09054029204347078) - val-aux/aime2024/score/maj@2/mean:np.float64(0.10859999999999999) - val-aux/aime2024/score/maj@2/std:np.float64(0.1285771742257817) - val-aux/aime2024/score/best@4/mean:np.float64(0.21036666666666667) - val-aux/aime2024/score/best@4/std:np.float64(0.11193448395031844) - val-aux/aime2024/score/worst@4/mean:np.float64(0.021033333333333334) - val-aux/aime2024/score/worst@4/std:np.float64(0.04855943896930762) - val-aux/aime2024/score/maj@4/mean:np.float64(0.1343333333333333) - val-aux/aime2024/score/maj@4/std:np.float64(0.11641075670594618) - val-aux/aime2024/score/best@8/mean:np.float64(0.25723333333333337) - val-aux/aime2024/score/best@8/std:np.float64(0.08895443867692503) - val-aux/aime2024/score/worst@8/mean:np.float64(0.004333333333333333) - 
val-aux/aime2024/score/worst@8/std:np.float64(0.019031689813162965) - val-aux/aime2024/score/maj@8/mean:np.float64(0.1599) - val-aux/aime2024/score/maj@8/std:np.float64(0.09340466569883403) - val-aux/aime2024/score/best@16/mean:np.float64(0.29733333333333334) - val-aux/aime2024/score/best@16/std:np.float64(0.06514431237801423) - val-aux/aime2024/score/worst@16/mean:np.float64(0.00023333333333333333) - val-aux/aime2024/score/worst@16/std:np.float64(0.0036277966521263178) - val-aux/aime2024/score/maj@16/mean:np.float64(0.17626666666666665) - val-aux/aime2024/score/maj@16/std:np.float64(0.06173809776128532) - val-core/aime2024/acc/mean@16:np.float64(0.10833333333333334) - val-aux/aime2024/acc/std@16:np.float64(0.12925127327897407) - val-aux/aime2024/acc/best@2/mean:np.float64(0.15830000000000002) - val-aux/aime2024/acc/best@2/std:np.float64(0.12809619489840465) - val-aux/aime2024/acc/worst@2/mean:np.float64(0.056333333333333326) - val-aux/aime2024/acc/worst@2/std:np.float64(0.09054029204347078) - val-aux/aime2024/acc/maj@2/mean:np.float64(0.10859999999999999) - val-aux/aime2024/acc/maj@2/std:np.float64(0.1285771742257817) - val-aux/aime2024/acc/best@4/mean:np.float64(0.21036666666666667) - val-aux/aime2024/acc/best@4/std:np.float64(0.11193448395031844) - val-aux/aime2024/acc/worst@4/mean:np.float64(0.021033333333333334) - val-aux/aime2024/acc/worst@4/std:np.float64(0.04855943896930762) - val-aux/aime2024/acc/maj@4/mean:np.float64(0.1343333333333333) - val-aux/aime2024/acc/maj@4/std:np.float64(0.11641075670594618) - val-aux/aime2024/acc/best@8/mean:np.float64(0.25723333333333337) - val-aux/aime2024/acc/best@8/std:np.float64(0.08895443867692503) - val-aux/aime2024/acc/worst@8/mean:np.float64(0.004333333333333333) - val-aux/aime2024/acc/worst@8/std:np.float64(0.019031689813162965) - val-aux/aime2024/acc/maj@8/mean:np.float64(0.1599) - val-aux/aime2024/acc/maj@8/std:np.float64(0.09340466569883403) - val-core/aime2024/acc/best@16/mean:np.float64(0.29733333333333334) - 
val-core/aime2024/acc/best@16/std:np.float64(0.06514431237801423) - val-aux/aime2024/acc/worst@16/mean:np.float64(0.00023333333333333333) - val-aux/aime2024/acc/worst@16/std:np.float64(0.0036277966521263178) - val-core/aime2024/acc/maj@16/mean:np.float64(0.17626666666666665) - val-core/aime2024/acc/maj@16/std:np.float64(0.06173809776128532) - val-aux/aime2025/reward/mean@16:np.float64(0.05) - val-aux/aime2025/reward/std@16:np.float64(0.08760857962568232) - val-aux/aime2025/reward/best@2/mean:np.float64(0.0852) - val-aux/aime2025/reward/best@2/std:np.float64(0.10121636666598814) - val-aux/aime2025/reward/worst@2/mean:np.float64(0.0163) - val-aux/aime2025/reward/worst@2/std:np.float64(0.0496105929199105) - val-aux/aime2025/reward/maj@2/mean:np.float64(0.05163333333333333) - val-aux/aime2025/reward/maj@2/std:np.float64(0.0878459117583282) - val-aux/aime2025/reward/best@4/mean:np.float64(0.13043333333333335) - val-aux/aime2025/reward/best@4/std:np.float64(0.09871412779277122) - val-aux/aime2025/reward/worst@4/mean:np.float64(0.0023333333333333335) - val-aux/aime2025/reward/worst@4/std:np.float64(0.0136869999632663) - val-aux/aime2025/reward/maj@4/mean:np.float64(0.061399999999999996) - val-aux/aime2025/reward/maj@4/std:np.float64(0.08790049362804725) - val-aux/aime2025/reward/best@8/mean:np.float64(0.17149999999999999) - val-aux/aime2025/reward/best@8/std:np.float64(0.07981597799514689) - val-aux/aime2025/reward/worst@8/mean:np.float64(6.666666666666667e-05) - val-aux/aime2025/reward/worst@8/std:np.float64(0.0014892205269125785) - val-aux/aime2025/reward/maj@8/mean:np.float64(0.07676666666666666) - val-aux/aime2025/reward/maj@8/std:np.float64(0.08395578375393035) - val-aux/aime2025/reward/best@16/mean:np.float64(0.20339999999999997) - val-aux/aime2025/reward/best@16/std:np.float64(0.051701719981313926) - val-aux/aime2025/reward/worst@16/mean:np.float64(3.3333333333333335e-05) - val-aux/aime2025/reward/worst@16/std:np.float64(0.001053565375285274) - 
val-aux/aime2025/reward/maj@16/mean:np.float64(0.08723333333333333) - val-aux/aime2025/reward/maj@16/std:np.float64(0.07133013379680679) - val-aux/aime2025/score/mean@16:np.float64(0.05) - val-aux/aime2025/score/std@16:np.float64(0.08760857962568232) - val-aux/aime2025/score/best@2/mean:np.float64(0.0852) - val-aux/aime2025/score/best@2/std:np.float64(0.10121636666598814) - val-aux/aime2025/score/worst@2/mean:np.float64(0.0163) - val-aux/aime2025/score/worst@2/std:np.float64(0.0496105929199105) - val-aux/aime2025/score/maj@2/mean:np.float64(0.05163333333333333) - val-aux/aime2025/score/maj@2/std:np.float64(0.0878459117583282) - val-aux/aime2025/score/best@4/mean:np.float64(0.13043333333333335) - val-aux/aime2025/score/best@4/std:np.float64(0.09871412779277122) - val-aux/aime2025/score/worst@4/mean:np.float64(0.0023333333333333335) - val-aux/aime2025/score/worst@4/std:np.float64(0.0136869999632663) - val-aux/aime2025/score/maj@4/mean:np.float64(0.061399999999999996) - val-aux/aime2025/score/maj@4/std:np.float64(0.08790049362804725) - val-aux/aime2025/score/best@8/mean:np.float64(0.17149999999999999) - val-aux/aime2025/score/best@8/std:np.float64(0.07981597799514689) - val-aux/aime2025/score/worst@8/mean:np.float64(6.666666666666667e-05) - val-aux/aime2025/score/worst@8/std:np.float64(0.0014892205269125785) - val-aux/aime2025/score/maj@8/mean:np.float64(0.07676666666666666) - val-aux/aime2025/score/maj@8/std:np.float64(0.08395578375393035) - val-aux/aime2025/score/best@16/mean:np.float64(0.20339999999999997) - val-aux/aime2025/score/best@16/std:np.float64(0.051701719981313926) - val-aux/aime2025/score/worst@16/mean:np.float64(3.3333333333333335e-05) - val-aux/aime2025/score/worst@16/std:np.float64(0.001053565375285274) - val-aux/aime2025/score/maj@16/mean:np.float64(0.08723333333333333) - val-aux/aime2025/score/maj@16/std:np.float64(0.07133013379680679) - val-core/aime2025/acc/mean@16:np.float64(0.05) - val-aux/aime2025/acc/std@16:np.float64(0.08760857962568232) - 
val-aux/aime2025/acc/best@2/mean:np.float64(0.0852) - val-aux/aime2025/acc/best@2/std:np.float64(0.10121636666598814) - val-aux/aime2025/acc/worst@2/mean:np.float64(0.0163) - val-aux/aime2025/acc/worst@2/std:np.float64(0.0496105929199105) - val-aux/aime2025/acc/maj@2/mean:np.float64(0.05163333333333333) - val-aux/aime2025/acc/maj@2/std:np.float64(0.0878459117583282) - val-aux/aime2025/acc/best@4/mean:np.float64(0.13043333333333335) - val-aux/aime2025/acc/best@4/std:np.float64(0.09871412779277122) - val-aux/aime2025/acc/worst@4/mean:np.float64(0.0023333333333333335) - val-aux/aime2025/acc/worst@4/std:np.float64(0.0136869999632663) - val-aux/aime2025/acc/maj@4/mean:np.float64(0.061399999999999996) - val-aux/aime2025/acc/maj@4/std:np.float64(0.08790049362804725) - val-aux/aime2025/acc/best@8/mean:np.float64(0.17149999999999999) - val-aux/aime2025/acc/best@8/std:np.float64(0.07981597799514689) - val-aux/aime2025/acc/worst@8/mean:np.float64(6.666666666666667e-05) - val-aux/aime2025/acc/worst@8/std:np.float64(0.0014892205269125785) - val-aux/aime2025/acc/maj@8/mean:np.float64(0.07676666666666666) - val-aux/aime2025/acc/maj@8/std:np.float64(0.08395578375393035) - val-core/aime2025/acc/best@16/mean:np.float64(0.20339999999999997) - val-core/aime2025/acc/best@16/std:np.float64(0.051701719981313926) - val-aux/aime2025/acc/worst@16/mean:np.float64(3.3333333333333335e-05) - val-aux/aime2025/acc/worst@16/std:np.float64(0.001053565375285274) - val-core/aime2025/acc/maj@16/mean:np.float64(0.08723333333333333) - val-core/aime2025/acc/maj@16/std:np.float64(0.07133013379680679) - val-aux/math500/reward/mean@4:np.float64(0.688) - val-aux/math500/reward/std@4:np.float64(0.13279869280115048) - val-aux/math500/reward/best@2/mean:np.float64(0.7483) - val-aux/math500/reward/best@2/std:np.float64(0.11122158050608053) - val-aux/math500/reward/worst@2/mean:np.float64(0.629048) - val-aux/math500/reward/worst@2/std:np.float64(0.11715494174465631) - 
val-aux/math500/reward/maj@2/mean:np.float64(0.688994) - val-aux/math500/reward/maj@2/std:np.float64(0.13278742583443306) - val-aux/math500/reward/best@4/mean:np.float64(0.793058) - val-aux/math500/reward/best@4/std:np.float64(0.0718632497550797) - val-aux/math500/reward/worst@4/mean:np.float64(0.579478) - val-aux/math500/reward/worst@4/std:np.float64(0.08002734966216196) - val-aux/math500/reward/maj@4/mean:np.float64(0.702952) - val-aux/math500/reward/maj@4/std:np.float64(0.12152187296349416) - val-aux/math500/score/mean@4:np.float64(0.688) - val-aux/math500/score/std@4:np.float64(0.13279869280115048) - val-aux/math500/score/best@2/mean:np.float64(0.7483) - val-aux/math500/score/best@2/std:np.float64(0.11122158050608053) - val-aux/math500/score/worst@2/mean:np.float64(0.629048) - val-aux/math500/score/worst@2/std:np.float64(0.11715494174465631) - val-aux/math500/score/maj@2/mean:np.float64(0.688994) - val-aux/math500/score/maj@2/std:np.float64(0.13278742583443306) - val-aux/math500/score/best@4/mean:np.float64(0.793058) - val-aux/math500/score/best@4/std:np.float64(0.0718632497550797) - val-aux/math500/score/worst@4/mean:np.float64(0.579478) - val-aux/math500/score/worst@4/std:np.float64(0.08002734966216196) - val-aux/math500/score/maj@4/mean:np.float64(0.702952) - val-aux/math500/score/maj@4/std:np.float64(0.12152187296349416) - val-core/math500/acc/mean@4:np.float64(0.688) - val-aux/math500/acc/std@4:np.float64(0.13279869280115048) - val-aux/math500/acc/best@2/mean:np.float64(0.7483) - val-aux/math500/acc/best@2/std:np.float64(0.11122158050608053) - val-aux/math500/acc/worst@2/mean:np.float64(0.629048) - val-aux/math500/acc/worst@2/std:np.float64(0.11715494174465631) - val-aux/math500/acc/maj@2/mean:np.float64(0.688994) - val-aux/math500/acc/maj@2/std:np.float64(0.13278742583443306) - val-core/math500/acc/best@4/mean:np.float64(0.793058) - val-core/math500/acc/best@4/std:np.float64(0.0718632497550797) - val-aux/math500/acc/worst@4/mean:np.float64(0.579478) - 
val-aux/math500/acc/worst@4/std:np.float64(0.08002734966216196) - val-core/math500/acc/maj@4/mean:np.float64(0.702952) - val-core/math500/acc/maj@4/std:np.float64(0.12152187296349416) - val-aux/num_turns/min:np.int32(2) - val-aux/num_turns/max:np.int32(2) - val-aux/num_turns/mean:np.float64(2.0) - val-aux/response_length/clip_ratio:0.07094594594594594 - val-aux/aime2024/response_length/clip_ratio:0.175 - val-aux/aime2025/response_length/clip_ratio:0.11041666666666666 - val-aux/math500/response_length/clip_ratio:0.0365 - training/global_step:200 - training/epoch:0 - critic/score/mean:0.6611111164093018 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6515573859214783 - critic/rewards/max:1.0134663581848145 - critic/rewards/min:-0.049987103790044785 - critic/advantages/mean:-0.12061826139688492 - critic/advantages/max:2.4747841358184814 - critic/advantages/min:-2.4748435020446777 - critic/returns/mean:-0.12061826139688492 - critic/returns/max:2.4747841358184814 - critic/returns/min:-2.4748435020446777 - response_length/mean:1059.615234375 - response_length/max:8192.0 - response_length/min:112.0 - response_length/clip_ratio:0.011111111380159855 - response_length_non_aborted/mean:1059.615234375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:112.0 - response_length_non_aborted/clip_ratio:0.011111111380159855 - response/aborted_ratio:0.0 - prompt_length/mean:229.45555114746094 - prompt_length/max:360.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.569657802581787e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1554452879354358) - timing_s/agent_loop/generate_sequences/max:np.float64(28.71227553114295) - 
timing_s/agent_loop/generate_sequences/mean:np.float64(6.399288857575812) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.71227553114295) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:197 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.509594515897334 - timing_s/reward:0.0001585204154253006 - timing_s/old_log_prob:9.68763198517263 - timing_s/ref:18.953248417936265 - timing_s/adv:0.0648733638226986 - timing_s/update_actor:19.543495774269104 - timing_s/save_checkpoint:53.92355306074023 - timing_s/update_weights:25.996352937072515 - timing_s/step:159.16243922058493 - timing_s/testing:133.60052424203604 - timing_s/stop_profile:0.0003941338509321213 - timing_per_token_ms/adv:6.989677515641498e-05 - timing_per_token_ms/update_actor:0.021056829019038374 - timing_per_token_ms/gen:0.039990398134408495 - timing_per_token_ms/ref:0.020420876382683334 - perf/total_num_tokens:1428187 - perf/time_per_step:159.16243922058493 - perf/throughput:2243.285235816002 - frontier/active_count:38.0 - frontier/completed_count:26.0 - frontier/blacklisted_count:1397.0 - frontier/mean_score:2.58937541110418 - frontier/mean_frontier_pct:0.5546162755501972 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - 
frontier/cluster_3/frontier:144.0 - frontier/cluster_3/score:2.613698667401039 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.3629999999999995 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.5574480099999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:1.9602267325699994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:2.0258379265699995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:2.0278909109755596 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:1.8938656376299994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.896225346569999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:1.8976456999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:2.9952524384900565 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:2.5841748374569993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.6036456999999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:3.1687691368999995 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - 
frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:3.5835038189480564 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:144.0 - frontier/cluster_21/score:2.861632130495073 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.2268985189592994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:2.2429287010999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.9932177692715096 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.0765509999999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:2.1964092276589993 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:2.9619465485989993 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.2632101978589994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.3661119415921092 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - 
frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:2.6953630750999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:2.9533152475099995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:3.4420987127989995 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:112.0 - frontier/cluster_42/score:2.764342366180699 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.5703110989592997 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.2450161325699995 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.549210197858999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.8319299999999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:3.2352130989592993 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:80.0 - frontier/cluster_48/score:1.9153918129999998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:128.0 - frontier/cluster_49/score:4.0107604989593 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:144.0 - frontier/cluster_50/score:1.9064900932009996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:80.0 - frontier/cluster_51/score:2.7313636690999994 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - 
frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.9000520059155592 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:112.0 - frontier/cluster_57/score:2.2853167127989993 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:200.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02656298641904705 - cluster/prob_snapshot/cluster_4:0.024015139040730573 - cluster/prob_snapshot/cluster_5:0.015828324379542593 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.019921759430395907 - cluster/prob_snapshot/cluster_8:0.02058856516316711 - cluster/prob_snapshot/cluster_9:0.020609429617652077 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.019247332463879097 - cluster/prob_snapshot/cluster_12:0.02943430147742981 - cluster/prob_snapshot/cluster_13:0.019285749189819932 - 
cluster/prob_snapshot/cluster_14:0.03044071255709946 - cluster/prob_snapshot/cluster_15:0.026262936109643326 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.016297830917295687 - cluster/prob_snapshot/cluster_18:0.032204160563110165 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.03641910387855548 - cluster/prob_snapshot/cluster_21:0.029082731061050044 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.022631941414475065 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.02279485595233251 - cluster/prob_snapshot/cluster_28:0.0304200342396279 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.021103961485471057 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.022322079133548257 - cluster/prob_snapshot/cluster_33:0.030102225220405003 - cluster/prob_snapshot/cluster_34:0.03316396386826653 - cluster/prob_snapshot/cluster_35:0.034209752985187504 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.027392940759110293 - cluster/prob_snapshot/cluster_38:0.030014505416869352 - cluster/prob_snapshot/cluster_39:0.034982005577565686 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.028093976419810263 - cluster/prob_snapshot/cluster_43:0.02612203911106246 - cluster/prob_snapshot/cluster_44:0.022816070491896646 - cluster/prob_snapshot/cluster_45:0.025907590920707647 - cluster/prob_snapshot/cluster_46:0.028780868685406743 - cluster/prob_snapshot/cluster_47:0.03287942970710979 - cluster/prob_snapshot/cluster_48:0.019466102711245038 - cluster/prob_snapshot/cluster_49:0.04076130810054064 - cluster/prob_snapshot/cluster_50:0.01937563464578815 - cluster/prob_snapshot/cluster_51:0.027758814339499158 - cluster/prob_snapshot/cluster_52:0.0 - 
cluster/prob_snapshot/cluster_53:0.029473191767842476 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.023225644777813508 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  25%|██▌       | 201/800 [6:25:42<25:12:57, 151.55s/it]
[36m(TaskRunner pid=2823680)[0m step:201 - global_seqlen/min:330564 - global_seqlen/max:480436 - global_seqlen/minmax_diff:149872 - global_seqlen/balanced_min:422747 - global_seqlen/balanced_max:422801 - global_seqlen/mean:422781.0 - frontier/skipped_zero_acc_count:20.0 - actor/entropy:np.float64(0.22834346150220544) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012343459762632847 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05454262741841376) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00039381067553121183) - actor/ppo_kl:np.float64(4.402215703673088e-05) - actor/pg_clipfrac_lower:np.float64(1.2845268126208491e-06) - actor/grad_norm:np.float64(0.23149167320558003) - perf/mfu/actor:np.float64(0.1849547005225297) - perf/max_memory_allocated_gb:np.float64(80.07183074951172) - perf/max_memory_reserved_gb:np.float64(85.83203125) - perf/cpu_memory_used_gb:np.float64(104.9336929321289) - actor/lr:np.float64(1e-06) - training/global_step:201 - training/epoch:0 - critic/score/mean:0.5659722089767456 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5550220012664795 - critic/rewards/max:1.027099370956421 - critic/rewards/min:-0.1650267392396927 - critic/advantages/mean:-0.15544231235980988 - critic/advantages/max:2.474731683731079 - critic/advantages/min:-2.4748613834381104 - critic/returns/mean:-0.15544231235980988 - critic/returns/max:2.474731683731079 - critic/returns/min:-2.4748613834381104 - response_length/mean:1275.0821533203125 - response_length/max:8192.0 - response_length/min:155.0 - response_length/clip_ratio:0.021990740671753883 - response_length_non_aborted/mean:1275.0821533203125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:155.0 - response_length_non_aborted/clip_ratio:0.021990740671753883 - response/aborted_ratio:0.0 - prompt_length/mean:247.3333282470703 - prompt_length/max:886.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.316896855831146e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2130015650764108) - timing_s/agent_loop/generate_sequences/max:np.float64(30.828988544642925) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.328326633635697) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.828988544642925) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:189 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:33.85656216181815 - timing_s/reward:0.00017823558300733566 - timing_s/old_log_prob:11.920193456113338 - timing_s/ref:18.852206180803478 - timing_s/adv:0.10331272147595882 - timing_s/update_actor:27.136024369858205 - timing_s/update_weights:27.562113415449858 - timing_s/step:119.83710471820086 - timing_s/stop_profile:5.901884287595749e-05 - timing_per_token_ms/adv:7.854288687184552e-05 - timing_per_token_ms/update_actor:0.020630002402263554 - timing_per_token_ms/gen:0.030732008160165917 - timing_per_token_ms/ref:0.014332278505393155 - perf/total_num_tokens:1691124 - perf/time_per_step:119.83710471820086 - perf/throughput:3527.964072514746 - frontier/active_count:38.0 - frontier/completed_count:26.0 - frontier/blacklisted_count:1417.0 - frontier/mean_score:2.604698602139219 - frontier/mean_frontier_pct:0.5811932653919916 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:144.0 - frontier/cluster_3/score:2.613698667401039 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.3629999999999995 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.5574480099999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:1.9602267325699994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:2.0258379265699995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:2.0278909109755596 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:1.6257059463409995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:2.9273577425989994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:1.8976456999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:128.0 - frontier/cluster_14/score:2.9966767069430396 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:2.5841748374569993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.6036456999999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:3.1687691368999995 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:128.0 - frontier/cluster_20/score:3.408452673263639 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:144.0 - frontier/cluster_21/score:2.861632130495073 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.2268985189592994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:2.2429287010999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.9932177692715096 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.353585699999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:128.0 - frontier/cluster_32/score:1.8374864593612994 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:112.0 - 
frontier/cluster_33/score:2.9733625840192994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.2632101978589994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:3.8562783591144765 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:2.6953630750999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:2.9533152475099995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:3.4420987127989995 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:112.0 - frontier/cluster_42/score:2.764342366180699 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:2.6992177692715096 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.4715112927989997 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:112.0 - frontier/cluster_45/score:2.684447138501299 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:2.8823509999999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:3.164649169271509 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:1.6407742690999998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:144.0 - frontier/cluster_49/score:4.30753234927151 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:144.0 - frontier/cluster_50/score:1.9064900932009996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:2.811954568369999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.9000520059155592 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:112.0 - frontier/cluster_57/score:2.2853167127989993 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:201.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02640671892804986 - cluster/prob_snapshot/cluster_4:0.023873860290493636 - cluster/prob_snapshot/cluster_5:0.01573520787153929 - cluster/prob_snapshot/cluster_6:0.0 - 
cluster/prob_snapshot/cluster_7:0.01980456163820017 - cluster/prob_snapshot/cluster_8:0.020467444617061143 - cluster/prob_snapshot/cluster_9:0.020488186327969695 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0164248314161531 - cluster/prob_snapshot/cluster_12:0.029575679122769087 - cluster/prob_snapshot/cluster_13:0.01917229298461955 - cluster/prob_snapshot/cluster_14:0.030276022444915226 - cluster/prob_snapshot/cluster_15:0.02610843378572049 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.01620195234754586 - cluster/prob_snapshot/cluster_18:0.03201470658788773 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.03443627715965116 - cluster/prob_snapshot/cluster_21:0.028911640154983938 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.022498799882666745 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.022660756009987203 - cluster/prob_snapshot/cluster_28:0.03024107610774907 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.023778745739950764 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.01856449217793715 - cluster/prob_snapshot/cluster_33:0.03004047521111211 - cluster/prob_snapshot/cluster_34:0.03296886346263219 - cluster/prob_snapshot/cluster_35:0.038960749380767275 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.027231790726657937 - cluster/prob_snapshot/cluster_38:0.029837932972001005 - cluster/prob_snapshot/cluster_39:0.034776209807639145 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.027928702262078133 - cluster/prob_snapshot/cluster_43:0.027270735470675378 - cluster/prob_snapshot/cluster_44:0.02497017152376667 - cluster/prob_snapshot/cluster_45:0.027121504842063243 - 
cluster/prob_snapshot/cluster_46:0.02912096702588431 - cluster/prob_snapshot/cluster_47:0.03197308173322673 - cluster/prob_snapshot/cluster_48:0.0165770696863014 - cluster/prob_snapshot/cluster_49:0.04351985844404988 - cluster/prob_snapshot/cluster_50:0.01926164965310659 - cluster/prob_snapshot/cluster_51:0.02840973783685176 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.029299803819041008 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.023089010546294326 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  25%|██▌       | 202/800 [6:27:39<23:24:58, 140.97s/it]
[36m(TaskRunner pid=2823680)[0m step:202 - global_seqlen/min:334476 - global_seqlen/max:446757 - global_seqlen/minmax_diff:112281 - global_seqlen/balanced_min:401070 - global_seqlen/balanced_max:401131 - global_seqlen/mean:401090.5 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.24346779982502365) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.014219522476196289 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.02136974695167737) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004955460689397653) - actor/ppo_kl:np.float64(1.5995005947977877e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.22524644033266947) - perf/mfu/actor:np.float64(0.21759547316092792) - perf/max_memory_allocated_gb:np.float64(80.07183074951172) - perf/max_memory_reserved_gb:np.float64(85.83203125) - perf/cpu_memory_used_gb:np.float64(104.67120361328125) - actor/lr:np.float64(1e-06) - training/global_step:202 - training/epoch:0 - critic/score/mean:0.5863401889801025 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5747764110565186 - critic/rewards/max:1.038748025894165 - critic/rewards/min:-0.06371919065713882 - critic/advantages/mean:-0.13481645286083221 - critic/advantages/max:2.4748215675354004 - critic/advantages/min:-2.474838972091675 - critic/returns/mean:-0.13481645286083221 - critic/returns/max:2.4748215675354004 - critic/returns/min:-2.474838972091675 - response_length/mean:1254.1043701171875 - response_length/max:8192.0 - response_length/min:66.0 - response_length/clip_ratio:0.019329896196722984 - response_length_non_aborted/mean:1254.1043701171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:66.0 - response_length_non_aborted/clip_ratio:0.019329896196722984 - response/aborted_ratio:0.0 - prompt_length/mean:242.9690704345703 - prompt_length/max:728.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.620880544185638e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.7870511189103127) - timing_s/agent_loop/generate_sequences/max:np.float64(29.88412242103368) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.049090442619672) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.88412242103368) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:195 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.678361882455647 - timing_s/reward:0.00017562508583068848 - timing_s/old_log_prob:10.14696685411036 - timing_s/ref:22.869397275149822 - timing_s/adv:0.08671411499381065 - timing_s/update_actor:21.977942783385515 - timing_s/update_weights:28.8903567828238 - timing_s/step:116.05695071164519 - timing_s/stop_profile:6.151292473077774e-05 - timing_per_token_ms/adv:7.464229178561494e-05 - timing_per_token_ms/update_actor:0.018918304340672838 - timing_per_token_ms/gen:0.032551222925194745 - timing_per_token_ms/ref:0.019685655841551533 - perf/total_num_tokens:1604362 - perf/time_per_step:116.05695071164519 - perf/throughput:3455.9799955157227 - frontier/active_count:38.0 - frontier/completed_count:26.0 - frontier/blacklisted_count:1448.0 - frontier/mean_score:2.592306038642532 - frontier/mean_frontier_pct:0.5954753964956957 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:144.0 - frontier/cluster_3/score:2.613698667401039 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.3629999999999995 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.9902136069999998 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:1.9602267325699994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:2.0258379265699995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:2.0278909109755596 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:1.6257059463409995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:2.9491504198192993 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:1.8976456999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:128.0 - frontier/cluster_14/score:2.9976736948601275 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:128.0 - frontier/cluster_15/score:2.1089223862198994 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.6036456999999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:3.1687691368999995 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:128.0 - frontier/cluster_20/score:3.408452673263639 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:144.0 - frontier/cluster_21/score:2.861632130495073 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.4588289632715092 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:2.470050090769999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.9952524384900565 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:1.9475099899999992 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:128.0 - frontier/cluster_32/score:1.8374864593612994 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:112.0 - frontier/cluster_33/score:2.9813538088135094 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.2632101978589994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:3.8562783591144765 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:2.1867541525699994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:2.9673206732569994 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:3.3094690989592994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:128.0 - frontier/cluster_42/score:2.8350396563264892 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:2.6992177692715096 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.4715112927989997 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:112.0 - frontier/cluster_45/score:2.684447138501299 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:2.9176456999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.115254418490056 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:1.6407742690999998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:144.0 - frontier/cluster_49/score:4.30753234927151 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:144.0 - frontier/cluster_50/score:1.9064900932009996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:2.868368197858999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.9000520059155592 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:112.0 - frontier/cluster_57/score:2.2853167127989993 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:202.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02653295670097361 - cluster/prob_snapshot/cluster_4:0.023987989689241596 - cluster/prob_snapshot/cluster_5:0.02020364937964635 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.01989923768491955 - 
cluster/prob_snapshot/cluster_8:0.020565289587234754 - cluster/prob_snapshot/cluster_9:0.020586130454248192 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.01650335060455635 - cluster/prob_snapshot/cluster_12:0.029938294482711757 - cluster/prob_snapshot/cluster_13:0.01926394646019198 - cluster/prob_snapshot/cluster_14:0.030430878410501697 - cluster/prob_snapshot/cluster_15:0.02140872131022164 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.01627940605873746 - cluster/prob_snapshot/cluster_18:0.03216775344204155 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.034600900373472764 - cluster/prob_snapshot/cluster_21:0.029049852746812647 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02496079721479711 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.025074708467753285 - cluster/prob_snapshot/cluster_28:0.03040629903139028 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.019770143698609813 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.018653240051324547 - cluster/prob_snapshot/cluster_33:0.030265207120524603 - cluster/prob_snapshot/cluster_34:0.03312647168009721 - cluster/prob_snapshot/cluster_35:0.03914700191167312 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0221988303278694 - cluster/prob_snapshot/cluster_38:0.030122749773492316 - cluster/prob_snapshot/cluster_39:0.03359606882043985 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.02877989929942906 - cluster/prob_snapshot/cluster_43:0.027401103689506 - cluster/prob_snapshot/cluster_44:0.02508954185717587 - cluster/prob_snapshot/cluster_45:0.027251159661313275 - cluster/prob_snapshot/cluster_46:0.02961847438360562 - 
cluster/prob_snapshot/cluster_47:0.03162449888705197 - cluster/prob_snapshot/cluster_48:0.01665631665236721 - cluster/prob_snapshot/cluster_49:0.04372790587405824 - cluster/prob_snapshot/cluster_50:0.019353730299765906 - cluster/prob_snapshot/cluster_51:0.02911823392094379 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.02943987203392588 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.023199387957373634 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  25%|██▌       | 203/800 [6:29:36<22:12:11, 133.89s/it]
[36m(TaskRunner pid=2823680)[0m step:203 - global_seqlen/min:342957 - global_seqlen/max:589452 - global_seqlen/minmax_diff:246495 - global_seqlen/balanced_min:433517 - global_seqlen/balanced_max:433592 - global_seqlen/mean:433556.25 - frontier/skipped_zero_acc_count:36.0 - actor/entropy:np.float64(0.22573514387983343) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012226446531713009 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04865969845559448) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006375407636211173) - actor/ppo_kl:np.float64(5.865380721188459e-05) - actor/pg_clipfrac_lower:np.float64(6.275005706930128e-07) - actor/grad_norm:np.float64(0.23700150474905968) - perf/mfu/actor:np.float64(0.24866846674740964) - perf/max_memory_allocated_gb:np.float64(80.07183074951172) - perf/max_memory_reserved_gb:np.float64(85.83203125) - perf/cpu_memory_used_gb:np.float64(104.23884963989258) - actor/lr:np.float64(1e-06) - training/global_step:203 - training/epoch:0 - critic/score/mean:0.5978260636329651 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5869944095611572 - critic/rewards/max:1.0152790546417236 - critic/rewards/min:-0.07277340441942215 - critic/advantages/mean:-0.16514666378498077 - critic/advantages/max:2.4748265743255615 - critic/advantages/min:-2.474851608276367 - critic/returns/mean:-0.16514666378498077 - critic/returns/max:2.4748265743255615 - critic/returns/min:-2.474851608276367 - response_length/mean:1268.3233642578125 - response_length/max:8192.0 - response_length/min:186.0 - response_length/clip_ratio:0.02038043551146984 - response_length_non_aborted/mean:1268.3233642578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:186.0 - response_length_non_aborted/clip_ratio:0.02038043551146984 - response/aborted_ratio:0.0 - prompt_length/mean:244.61956787109375 - prompt_length/max:409.0 - prompt_length/min:181.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.27504152059555e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6903182435780764) - timing_s/agent_loop/generate_sequences/max:np.float64(30.844143277965486) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.861653925470819) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.844143277965486) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:199 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.911853979341686 - timing_s/reward:0.00022504106163978577 - timing_s/old_log_prob:10.59551713988185 - timing_s/ref:23.08580810111016 - timing_s/adv:0.07984567526727915 - timing_s/update_actor:20.78059522435069 - timing_s/update_weights:28.948133145458996 - timing_s/step:117.10611394513398 - timing_s/stop_profile:5.575455725193024e-05 - timing_per_token_ms/adv:7.170526352081509e-05 - timing_per_token_ms/update_actor:0.0186619757637906 - timing_per_token_ms/gen:0.03525693366514515 - timing_per_token_ms/ref:0.020732167997074304 - perf/total_num_tokens:1734225 - perf/time_per_step:117.10611394513398 - perf/throughput:3702.2511924793935 - frontier/active_count:38.0 - frontier/completed_count:26.0 - frontier/blacklisted_count:1484.0 - frontier/mean_score:2.5927185996199373 - frontier/mean_frontier_pct:0.6167912695083443 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:144.0 - frontier/cluster_3/score:2.613698667401039 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.3629999999999995 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.9902136069999998 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:1.9602267325699994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:2.0258379265699995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:2.3195236376828916 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:1.6257059463409995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:128.0 - frontier/cluster_12/score:2.9644052938735093 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:1.8976456999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:2.9983715864020892 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:128.0 - frontier/cluster_15/score:2.1089223862198994 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.6036456999999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:128.0 - frontier/cluster_18/score:3.7181383958299996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:128.0 - frontier/cluster_20/score:3.285916871284547 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:144.0 - frontier/cluster_21/score:2.861632130495073 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.6211802742900563 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:2.470050090769999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.9952524384900565 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:2.2632569929999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:128.0 - frontier/cluster_32/score:1.8374864593612994 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:128.0 - 
frontier/cluster_33/score:2.9869476661694563 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:112.0 - frontier/cluster_34/score:2.5842471385012993 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:3.8562783591144765 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:2.4307279067989995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:2.9673206732569994 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:3.3094690989592994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:128.0 - frontier/cluster_42/score:2.8350396563264892 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:2.6992177692715096 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:2.6300579049592994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:112.0 - frontier/cluster_45/score:2.684447138501299 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9423519899999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.080678092943039 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:1.6407742690999998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:144.0 - frontier/cluster_49/score:4.30753234927151 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:144.0 - frontier/cluster_50/score:1.9064900932009996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.907857738501299 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:176.0 - frontier/cluster_53/score:2.330036404140891 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:1.8997216989592995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:203.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.026528734699190763 - cluster/prob_snapshot/cluster_4:0.02398417265006363 - cluster/prob_snapshot/cluster_5:0.020200434515782434 - cluster/prob_snapshot/cluster_6:0.0 - 
cluster/prob_snapshot/cluster_7:0.01989607125993609 - cluster/prob_snapshot/cluster_8:0.020562017178121797 - cluster/prob_snapshot/cluster_9:0.02354289267545075 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.016500724543071357 - cluster/prob_snapshot/cluster_12:0.03008836579476296 - cluster/prob_snapshot/cluster_13:0.019260881124608907 - cluster/prob_snapshot/cluster_14:0.03043311967723779 - cluster/prob_snapshot/cluster_15:0.021405314691782584 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.016276815631964512 - cluster/prob_snapshot/cluster_18:0.03773866831249147 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.03335167056904168 - cluster/prob_snapshot/cluster_21:0.02904523024915923 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02660467213098344 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.025070718506700385 - cluster/prob_snapshot/cluster_28:0.030401460692031156 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02297179283604564 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.01865027189313396 - cluster/prob_snapshot/cluster_33:0.03031716822772445 - cluster/prob_snapshot/cluster_34:0.02622980513764201 - cluster/prob_snapshot/cluster_35:0.039140772726070965 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.02467160295471646 - cluster/prob_snapshot/cluster_38:0.030117956553321598 - cluster/prob_snapshot/cluster_39:0.033590722915569346 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.02877531975755886 - cluster/prob_snapshot/cluster_43:0.027396743545631617 - cluster/prob_snapshot/cluster_44:0.026694779040291362 - cluster/prob_snapshot/cluster_45:0.027246823376971832 - 
cluster/prob_snapshot/cluster_46:0.02986452734888629 - cluster/prob_snapshot/cluster_47:0.031268521058152615 - cluster/prob_snapshot/cluster_48:0.016653666250476668 - cluster/prob_snapshot/cluster_49:0.04372094776160055 - cluster/prob_snapshot/cluster_50:0.01935065067751531 - cluster/prob_snapshot/cluster_51:0.02951441474483231 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.023649596021095464 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.01928195227038174 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  26%|██▌       | 204/800 [6:31:22<20:46:35, 125.50s/it]
[36m(TaskRunner pid=2823680)[0m step:204 - global_seqlen/min:315826 - global_seqlen/max:451997 - global_seqlen/minmax_diff:136171 - global_seqlen/balanced_min:384358 - global_seqlen/balanced_max:384367 - global_seqlen/mean:384363.25 - frontier/skipped_zero_acc_count:22.0 - actor/entropy:np.float64(0.21253178707974138) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012015936896204948 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.07908951189892832) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004926071541942876) - actor/ppo_kl:np.float64(5.954743835874704e-05) - actor/pg_clipfrac_lower:np.float64(7.592115132797087e-07) - actor/grad_norm:np.float64(0.2117018359048026) - perf/mfu/actor:np.float64(0.16087501317937652) - perf/max_memory_allocated_gb:np.float64(80.07183074951172) - perf/max_memory_reserved_gb:np.float64(85.83203125) - perf/cpu_memory_used_gb:np.float64(104.50408840179443) - actor/lr:np.float64(1e-06) - training/global_step:204 - training/epoch:0 - critic/score/mean:0.6426886916160583 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6329783797264099 - critic/rewards/max:1.0034794807434082 - critic/rewards/min:-0.07042521238327026 - critic/advantages/mean:-0.15381672978401184 - critic/advantages/max:2.474740505218506 - critic/advantages/min:-2.4748382568359375 - critic/returns/mean:-0.15381672978401184 - critic/returns/max:2.474740505218506 - critic/returns/min:-2.4748382568359375 - response_length/mean:1153.5400390625 - response_length/max:8192.0 - response_length/min:153.0 - response_length/clip_ratio:0.012971698306500912 - response_length_non_aborted/mean:1153.5400390625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:153.0 - response_length_non_aborted/clip_ratio:0.012971698306500912 - response/aborted_ratio:0.0 - prompt_length/mean:230.3773651123047 - prompt_length/max:411.0 - prompt_length/min:185.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.042304009199142e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2746053077280521) - timing_s/agent_loop/generate_sequences/max:np.float64(28.781846965663135) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.878630276400145) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.781846965663135) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:211 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.431043840944767 - timing_s/reward:0.00022552907466888428 - timing_s/old_log_prob:11.32426327560097 - timing_s/ref:11.910960990004241 - timing_s/adv:0.09358107298612595 - timing_s/update_actor:28.023619073443115 - timing_s/update_weights:23.479801737703383 - timing_s/step:105.67106559500098 - timing_s/stop_profile:6.760843098163605e-05 - timing_per_token_ms/adv:7.974105585058647e-05 - timing_per_token_ms/update_actor:0.023879112542365136 - timing_per_token_ms/gen:0.03110916133983039 - timing_per_token_ms/ref:0.01014940922593288 - perf/total_num_tokens:1537453 - perf/time_per_step:105.67106559500098 - perf/throughput:3637.3556738144903 - frontier/active_count:36.0 - frontier/completed_count:28.0 - frontier/blacklisted_count:1505.0 - frontier/mean_score:2.6087438461001167 - frontier/mean_frontier_pct:0.6078367472403867 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.3629999999999995 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.2931495248999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:1.9602267325699994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:2.318086548598999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:2.3195236376828916 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:1.6257059463409995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:128.0 - frontier/cluster_12/score:2.975083705711456 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:1.8976456999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:2.9988601104814623 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:128.0 - frontier/cluster_15/score:2.3762456703539296 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.6036456999999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:144.0 - frontier/cluster_18/score:4.102696877081 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:144.0 - frontier/cluster_21/score:2.903142491346551 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.6211802742900563 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:2.470050090769999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.9952524384900565 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:2.2632569929999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:128.0 - frontier/cluster_32/score:1.8374864593612994 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:128.0 - frontier/cluster_33/score:2.990863366318619 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:112.0 - frontier/cluster_34/score:2.5842471385012993 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:144.0 - frontier/cluster_35/score:4.199394851380133 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:2.4307279067989995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:2.9673206732569994 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:3.3094690989592994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:128.0 - frontier/cluster_42/score:2.8350396563264892 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:2.7894524384900565 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:2.6300579049592994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:112.0 - frontier/cluster_45/score:2.684447138501299 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9596463929999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.080678092943039 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:1.6407742690999998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:144.0 - frontier/cluster_49/score:3.9152726444900563 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:160.0 - frontier/cluster_50/score:1.6345430652406996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.907857738501299 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:176.0 - frontier/cluster_53/score:2.531025482898624 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:1.8997216989592995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:204.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.025161109239228013 - cluster/prob_snapshot/cluster_5:0.02441734477270111 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.020872399048602948 - 
cluster/prob_snapshot/cluster_8:0.024682872989963803 - cluster/prob_snapshot/cluster_9:0.02469817504473584 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.017310437963075943 - cluster/prob_snapshot/cluster_12:0.03167854681136405 - cluster/prob_snapshot/cluster_13:0.02020603925308985 - cluster/prob_snapshot/cluster_14:0.03193171681463711 - cluster/prob_snapshot/cluster_15:0.02530214849387128 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.017075541531408493 - cluster/prob_snapshot/cluster_18:0.043685317096773035 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.03091252025468267 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02791020026116029 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.02630097340678334 - cluster/prob_snapshot/cluster_28:0.03189330249848177 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02409905054478193 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.019565466580445185 - cluster/prob_snapshot/cluster_33:0.03184656787115871 - cluster/prob_snapshot/cluster_34:0.02751693802496556 - cluster/prob_snapshot/cluster_35:0.04471495243090447 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.025882272701569908 - cluster/prob_snapshot/cluster_38:0.031595886418806174 - cluster/prob_snapshot/cluster_39:0.03523906623900316 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.03018736457485157 - cluster/prob_snapshot/cluster_43:0.02970195409330482 - cluster/prob_snapshot/cluster_44:0.02800472884137796 - cluster/prob_snapshot/cluster_45:0.028583862758681442 - cluster/prob_snapshot/cluster_46:0.03151417105533651 - cluster/prob_snapshot/cluster_47:0.03280291071833959 - 
cluster/prob_snapshot/cluster_48:0.01747088473200999 - cluster/prob_snapshot/cluster_49:0.04168963297053554 - cluster/prob_snapshot/cluster_50:0.017404535175938998 - cluster/prob_snapshot/cluster_51:0.030962727977387108 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.026950236420855745 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.02022814438918611 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  26%|██▌       | 205/800 [6:33:21<20:25:28, 123.58s/it]
[36m(TaskRunner pid=2823680)[0m step:205 - global_seqlen/min:359259 - global_seqlen/max:504577 - global_seqlen/minmax_diff:145318 - global_seqlen/balanced_min:442204 - global_seqlen/balanced_max:442290 - global_seqlen/mean:442235.5 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.21481259527305763) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012303562834858894 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.037854703441553283) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003447106256418111) - actor/ppo_kl:np.float64(-2.5616396713543408e-05) - actor/pg_clipfrac_lower:np.float64(3.7222518787732242e-06) - actor/grad_norm:np.float64(0.20642301129798094) - perf/mfu/actor:np.float64(0.27182813260230854) - perf/max_memory_allocated_gb:np.float64(80.07183074951172) - perf/max_memory_reserved_gb:np.float64(85.83203125) - perf/cpu_memory_used_gb:np.float64(104.61547088623047) - actor/lr:np.float64(1e-06) - training/global_step:205 - training/epoch:0 - critic/score/mean:0.550000011920929 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5381888151168823 - critic/rewards/max:1.0424624681472778 - critic/rewards/min:-0.13385127484798431 - critic/advantages/mean:-0.15025074779987335 - critic/advantages/max:2.4748172760009766 - critic/advantages/min:-2.474841833114624 - critic/returns/mean:-0.15025074779987335 - critic/returns/max:2.4748172760009766 - critic/returns/min:-2.474841833114624 - response_length/mean:1352.1221923828125 - response_length/max:8192.0 - response_length/min:191.0 - response_length/clip_ratio:0.02777777798473835 - response_length_non_aborted/mean:1352.1221923828125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:191.0 - response_length_non_aborted/clip_ratio:0.02777777798473835 - response/aborted_ratio:0.0 - prompt_length/mean:231.05555725097656 - prompt_length/max:355.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.818227797746658e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6222682110965252) - timing_s/agent_loop/generate_sequences/max:np.float64(32.39201888255775) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.881837326847744) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(32.39201888255775) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:181 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:34.48284943867475 - timing_s/reward:0.00013896264135837555 - timing_s/old_log_prob:10.273194057866931 - timing_s/ref:23.315666617825627 - timing_s/adv:0.0717532355338335 - timing_s/update_actor:19.452778440900147 - timing_s/update_weights:30.053621342405677 - timing_s/step:118.09124541375786 - timing_s/stop_profile:5.29363751411438e-05 - timing_per_token_ms/adv:6.294761900628264e-05 - timing_per_token_ms/update_actor:0.017065517349862572 - timing_per_token_ms/gen:0.03542050093954642 - timing_per_token_ms/ref:0.02045434868849012 - perf/total_num_tokens:1768942 - perf/time_per_step:118.09124541375786 - perf/throughput:3744.862698759198 - frontier/active_count:34.0 - frontier/completed_count:30.0 - frontier/blacklisted_count:1543.0 - frontier/mean_score:2.5491738657484593 - frontier/mean_frontier_pct:0.613463593710876 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.3629999999999995 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.2931495248999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:1.9602267325699994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:2.318086548598999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:1.6257059463409995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:2.9825585939980193 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:1.8976456999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:2.9992020773370234 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:128.0 - frontier/cluster_15/score:2.3762456703539296 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 
- frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.6036456999999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:144.0 - frontier/cluster_18/score:3.7718878139566994 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:144.0 - frontier/cluster_21/score:2.903142491346551 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.6211802742900563 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:2.0290350635389993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.9952524384900565 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.4842798950999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:144.0 - frontier/cluster_32/score:1.5862405215529096 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:128.0 - frontier/cluster_33/score:2.990863366318619 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:144.0 - frontier/cluster_35/score:3.839576395966093 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:2.0015095347592995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:128.0 - frontier/cluster_38/score:2.9771244712798994 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:112.0 - frontier/cluster_39/score:3.2166283692715094 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:128.0 - frontier/cluster_42/score:2.8350396563264892 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:128.0 - frontier/cluster_43/score:2.8526167069430395 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:2.7410405334715096 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:112.0 - frontier/cluster_45/score:2.684447138501299 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9596463929999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.080678092943039 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:1.6407742690999998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - frontier/cluster_49/score:3.6406908511430394 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:160.0 - 
frontier/cluster_50/score:1.6345430652406996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:128.0 - frontier/cluster_51/score:2.335500416950909 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:176.0 - frontier/cluster_53/score:2.531025482898624 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:1.8997216989592995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:205.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.02726373470786944 - cluster/prob_snapshot/cluster_5:0.0264578164588872 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.02261663207958621 - cluster/prob_snapshot/cluster_8:0.026745533936049018 - 
cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.01875701042490345 - cluster/prob_snapshot/cluster_12:0.03441205512375713 - cluster/prob_snapshot/cluster_13:0.02189458693793026 - cluster/prob_snapshot/cluster_14:0.03460408369522114 - cluster/prob_snapshot/cluster_15:0.02741656011733091 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.018502484524001515 - cluster/prob_snapshot/cluster_18:0.04351914883942492 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.03349577092814877 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.03024255760463164 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.023410526316247272 - cluster/prob_snapshot/cluster_28:0.034558513697034256 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02866303343212024 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.018301667694663987 - cluster/prob_snapshot/cluster_33:0.03450787362115793 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.04430012367762042 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.02309294328013066 - cluster/prob_snapshot/cluster_38:0.03434935750202339 - cluster/prob_snapshot/cluster_39:0.03711269678951551 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.03271001653676461 - cluster/prob_snapshot/cluster_43:0.032912816386513416 - cluster/prob_snapshot/cluster_44:0.031625476905664064 - cluster/prob_snapshot/cluster_45:0.0309725157094366 - cluster/prob_snapshot/cluster_46:0.03414769957166936 - cluster/prob_snapshot/cluster_47:0.035544134678943795 - cluster/prob_snapshot/cluster_48:0.018930865166415907 - 
cluster/prob_snapshot/cluster_49:0.04200542933513807 - cluster/prob_snapshot/cluster_50:0.018858971011134228 - cluster/prob_snapshot/cluster_51:0.026946451027451572 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.029202372960053 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.02191853932256004 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  26%|██▌       | 206/800 [6:35:17<20:00:41, 121.28s/it]
[36m(TaskRunner pid=2823680)[0m step:206 - global_seqlen/min:355001 - global_seqlen/max:472368 - global_seqlen/minmax_diff:117367 - global_seqlen/balanced_min:405613 - global_seqlen/balanced_max:405676 - global_seqlen/mean:405640.75 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.22152122565882004) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012258288450539112 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.06410662563575897) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00029085596570246013) - actor/ppo_kl:np.float64(5.593157757562523e-05) - actor/pg_clipfrac_lower:np.float64(1.2699859113088176e-06) - actor/grad_norm:np.float64(0.21913934374849) - perf/mfu/actor:np.float64(0.21729720816778253) - perf/max_memory_allocated_gb:np.float64(80.07183074951172) - perf/max_memory_reserved_gb:np.float64(85.83203125) - perf/cpu_memory_used_gb:np.float64(104.4355583190918) - actor/lr:np.float64(1e-06) - training/global_step:206 - training/epoch:0 - critic/score/mean:0.626329779624939 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6151080131530762 - critic/rewards/max:1.0567045211791992 - critic/rewards/min:-0.22727134823799133 - critic/advantages/mean:-0.17094898223876953 - critic/advantages/max:2.474841833114624 - critic/advantages/min:-2.474848508834839 - critic/returns/mean:-0.17094898223876953 - critic/returns/max:2.474841833114624 - critic/returns/min:-2.474848508834839 - response_length/mean:1255.97607421875 - response_length/max:8192.0 - response_length/min:197.0 - response_length/clip_ratio:0.014627659693360329 - response_length_non_aborted/mean:1255.97607421875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:197.0 - response_length_non_aborted/clip_ratio:0.014627659693360329 - response/aborted_ratio:0.0 - prompt_length/mean:244.46807861328125 - prompt_length/max:535.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.048823267221451e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(2.0254657939076424) - timing_s/agent_loop/generate_sequences/max:np.float64(29.78755768481642) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.518709813277383) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.78755768481642) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:214 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.15152755379677 - timing_s/reward:0.000194476917386055 - timing_s/old_log_prob:10.472140944562852 - timing_s/ref:21.146209251135588 - timing_s/adv:0.08479653019458055 - timing_s/update_actor:21.99872636795044 - timing_s/update_weights:28.99080482404679 - timing_s/step:115.22841493040323 - timing_s/stop_profile:7.088854908943176e-05 - timing_per_token_ms/adv:7.515197644897747e-05 - timing_per_token_ms/update_actor:0.01949664405038795 - timing_per_token_ms/gen:0.034041007728791046 - timing_per_token_ms/ref:0.018741090183523306 - perf/total_num_tokens:1622563 - perf/time_per_step:115.22841493040323 - perf/throughput:3520.3187533648083 - frontier/active_count:34.0 - frontier/completed_count:30.0 - frontier/blacklisted_count:1577.0 - frontier/mean_score:2.6136100445978 - frontier/mean_frontier_pct:0.6361707404886561 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.3629999999999995 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.2931495248999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:1.9602267325699994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:2.5226605840192993 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.0379941624386992 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:2.987791015798613 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.82835199 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:2.9992020773370234 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:128.0 - frontier/cluster_15/score:2.3762456703539296 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.6036456999999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:160.0 - frontier/cluster_18/score:4.14032146976969 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:144.0 - frontier/cluster_21/score:2.903142491346551 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:144.0 - frontier/cluster_23/score:2.7348261920030392 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:2.0290350635389993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.9952524384900565 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.6389959265699994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:144.0 - frontier/cluster_32/score:1.5862405215529096 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:2.9936043564230332 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:3.5877034771762646 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:2.3010566743315097 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:144.0 - frontier/cluster_38/score:2.3839871298959294 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:112.0 - frontier/cluster_39/score:3.2166283692715094 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:128.0 - frontier/cluster_42/score:2.8845277594285426 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:128.0 - frontier/cluster_43/score:2.8526167069430395 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:144.0 - frontier/cluster_44/score:2.2187283734300567 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:112.0 - frontier/cluster_45/score:2.684447138501299 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9596463929999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.080678092943039 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:1.6407742690999998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:176.0 - frontier/cluster_49/score:4.0484835958001275 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:160.0 - 
frontier/cluster_50/score:2.0441801456684896 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:128.0 - frontier/cluster_51/score:2.534850291865636 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:176.0 - frontier/cluster_53/score:2.531025482898624 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:1.8997216989592995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:206.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.02659157212211247 - cluster/prob_snapshot/cluster_5:0.025805523054661994 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.022059039583084222 - cluster/prob_snapshot/cluster_8:0.028388282208869898 - 
cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.02293418059878666 - cluster/prob_snapshot/cluster_12:0.03362253926466716 - cluster/prob_snapshot/cluster_13:0.031828322441305686 - cluster/prob_snapshot/cluster_14:0.033750951480447315 - cluster/prob_snapshot/cluster_15:0.02674062976008211 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.018046322594103062 - cluster/prob_snapshot/cluster_18:0.04659232203644157 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.03266996315675459 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.030775847620013133 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.022833361079302738 - cluster/prob_snapshot/cluster_28:0.03370650496912467 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02969743991170003 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.017850456721071307 - cluster/prob_snapshot/cluster_33:0.033687958590306044 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.04037354031573692 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.02589450466041245 - cluster/prob_snapshot/cluster_38:0.02682774680610052 - cluster/prob_snapshot/cluster_39:0.036197717000218524 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.03246048580324993 - cluster/prob_snapshot/cluster_43:0.03210138082920813 - cluster/prob_snapshot/cluster_44:0.024968038747965573 - cluster/prob_snapshot/cluster_45:0.030208916500827653 - cluster/prob_snapshot/cluster_46:0.03330581909242891 - cluster/prob_snapshot/cluster_47:0.03466782636204263 - cluster/prob_snapshot/cluster_48:0.018464141901345335 - 
cluster/prob_snapshot/cluster_49:0.04555884194790873 - cluster/prob_snapshot/cluster_50:0.023003793387276355 - cluster/prob_snapshot/cluster_51:0.028525456773128614 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.02848241501117364 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.021378157668141454 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  26%|██▌       | 207/800 [6:37:14<19:44:54, 119.89s/it]
[36m(TaskRunner pid=2823680)[0m step:207 - global_seqlen/min:353853 - global_seqlen/max:434802 - global_seqlen/minmax_diff:80949 - global_seqlen/balanced_min:398319 - global_seqlen/balanced_max:398622 - global_seqlen/mean:398462.5 - frontier/skipped_zero_acc_count:28.0 - actor/entropy:np.float64(0.25540520809590816) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01340832095593214 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.026691141829360276) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00035992852983326887) - actor/ppo_kl:np.float64(-2.5376021752876453e-05) - actor/pg_clipfrac_lower:np.float64(1.4310986625787337e-06) - actor/grad_norm:np.float64(0.24420556769921228) - perf/mfu/actor:np.float64(0.2023079760588768) - perf/max_memory_allocated_gb:np.float64(80.07183074951172) - perf/max_memory_reserved_gb:np.float64(85.83203125) - perf/cpu_memory_used_gb:np.float64(104.36687088012695) - actor/lr:np.float64(1e-06) - training/global_step:207 - training/epoch:0 - critic/score/mean:0.5562499761581421 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5452727675437927 - critic/rewards/max:1.026252269744873 - critic/rewards/min:-0.05588909238576889 - critic/advantages/mean:-0.1540234088897705 - critic/advantages/max:2.4748404026031494 - critic/advantages/min:-2.4748008251190186 - critic/returns/mean:-0.1540234088897705 - critic/returns/max:2.4748404026031494 - critic/returns/min:-2.4748008251190186 - response_length/mean:1227.324951171875 - response_length/max:8192.0 - response_length/min:108.0 - response_length/clip_ratio:0.014999999664723873 - response_length_non_aborted/mean:1227.324951171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:108.0 - response_length_non_aborted/clip_ratio:0.014999999664723873 - response/aborted_ratio:0.0 - prompt_length/mean:236.27999877929688 - prompt_length/max:380.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.024236351251602e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2589539708569646) - timing_s/agent_loop/generate_sequences/max:np.float64(29.953988228924572) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.343652639989159) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.953988228924572) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:195 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.78782590292394 - timing_s/reward:0.00012504681944847107 - timing_s/old_log_prob:10.193312617950141 - timing_s/ref:21.22410950437188 - timing_s/adv:0.07603695895522833 - timing_s/update_actor:23.36929097864777 - timing_s/update_weights:29.360083155333996 - timing_s/step:116.41877711750567 - timing_s/stop_profile:5.121156573295593e-05 - timing_per_token_ms/adv:6.493978819014379e-05 - timing_per_token_ms/update_actor:0.01995867308687092 - timing_per_token_ms/gen:0.032375110405683034 - timing_per_token_ms/ref:0.018126568903812746 - perf/total_num_tokens:1593850 - perf/time_per_step:116.41877711750567 - perf/throughput:3422.665225196597 - frontier/active_count:33.0 - frontier/completed_count:31.0 - frontier/blacklisted_count:1605.0 - frontier/mean_score:2.6062049886159135 - frontier/mean_frontier_pct:0.6439339199344373 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.5540999999999996 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.2931495248999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:1.9602267325699994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:144.0 - frontier/cluster_8/score:2.665862408813509 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.0379941624386992 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:160.0 - frontier/cluster_12/score:2.991453711059029 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.8798463929999993 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:2.9992020773370234 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:144.0 - frontier/cluster_15/score:1.9633719692477507 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.6036456999999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:176.0 - frontier/cluster_18/score:4.398225028838783 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:160.0 - frontier/cluster_21/score:2.9321997439425855 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:144.0 - frontier/cluster_23/score:2.814378334402127 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:2.320324544477299 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.6389959265699994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:144.0 - frontier/cluster_32/score:1.5862405215529096 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:2.9936043564230332 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:3.411392434023385 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:2.3010566743315097 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:144.0 - frontier/cluster_38/score:2.5687909909271505 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:112.0 - frontier/cluster_39/score:3.2166283692715094 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:144.0 - frontier/cluster_42/score:2.9191694315999794 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:128.0 - frontier/cluster_43/score:2.8968316948601274 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:144.0 - frontier/cluster_44/score:2.2187283734300567 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:112.0 - frontier/cluster_45/score:2.684447138501299 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9596463929999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.080678092943039 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:1.6407742690999998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:176.0 - frontier/cluster_49/score:3.733938517060089 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:160.0 - 
frontier/cluster_50/score:2.0441801456684896 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:128.0 - frontier/cluster_51/score:2.534850291865636 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:176.0 - frontier/cluster_53/score:2.531025482898624 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:144.0 - frontier/cluster_57/score:1.6298051892715097 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:207.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.029697191907407546 - cluster/prob_snapshot/cluster_5:0.026663052156664084 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.022792071359446275 - cluster/prob_snapshot/cluster_8:0.030996682805402506 - 
cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.02369629370350353 - cluster/prob_snapshot/cluster_12:0.03478241844072138 - cluster/prob_snapshot/cluster_13:0.033484730823686 - cluster/prob_snapshot/cluster_14:0.03487251073167572 - cluster/prob_snapshot/cluster_15:0.022828641852854284 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.018646009985665755 - cluster/prob_snapshot/cluster_18:0.05113931824650109 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.03409345699334961 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.03272351650162092 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.02697902325077732 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.03068429915595164 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.018443635401849182 - cluster/prob_snapshot/cluster_33:0.03480742455954978 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.039665156330868256 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.02675499066107194 - cluster/prob_snapshot/cluster_38:0.02986800791965201 - cluster/prob_snapshot/cluster_39:0.03740058336677006 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.033941950127427435 - cluster/prob_snapshot/cluster_43:0.03368222339208405 - cluster/prob_snapshot/cluster_44:0.02579773787093794 - cluster/prob_snapshot/cluster_45:0.03121277234147611 - cluster/prob_snapshot/cluster_46:0.0344125863948113 - cluster/prob_snapshot/cluster_47:0.03581985377670246 - cluster/prob_snapshot/cluster_48:0.019077713615832995 - cluster/prob_snapshot/cluster_49:0.04341548440217463 
- cluster/prob_snapshot/cluster_50:0.023768219756168302 - cluster/prob_snapshot/cluster_51:0.02947337049218192 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.029428898433177753 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.01895017324203622 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(RewardLoopWorker pid=2826760)[0m WARNING:2026-04-12 18:09:41,236:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  26%|██▌       | 208/800 [6:39:05<19:18:45, 117.44s/it]
[36m(TaskRunner pid=2823680)[0m step:208 - global_seqlen/min:343961 - global_seqlen/max:452523 - global_seqlen/minmax_diff:108562 - global_seqlen/balanced_min:386488 - global_seqlen/balanced_max:386561 - global_seqlen/mean:386534.25 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.2471713550882984) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01159688364714384 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.032080602053611074) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00040697033004181895) - actor/ppo_kl:np.float64(7.766430176115118e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.21234986300651842) - perf/mfu/actor:np.float64(0.2016868725956747) - perf/max_memory_allocated_gb:np.float64(80.07183074951172) - perf/max_memory_reserved_gb:np.float64(85.83203125) - perf/cpu_memory_used_gb:np.float64(104.91966247558594) - actor/lr:np.float64(1e-06) - training/global_step:208 - training/epoch:0 - critic/score/mean:0.5994898080825806 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5892267823219299 - critic/rewards/max:1.0357298851013184 - critic/rewards/min:-0.0495685338973999 - critic/advantages/mean:-0.12301502376794815 - critic/advantages/max:2.4747934341430664 - critic/advantages/min:-2.4748477935791016 - critic/returns/mean:-0.12301502376794815 - critic/returns/max:2.4747934341430664 - critic/returns/min:-2.4748477935791016 - response_length/mean:1150.7296142578125 - response_length/max:8192.0 - response_length/min:234.0 - response_length/clip_ratio:0.008928571827709675 - response_length_non_aborted/mean:1150.7296142578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:234.0 - response_length_non_aborted/clip_ratio:0.008928571827709675 - response/aborted_ratio:0.0 - prompt_length/mean:251.39796447753906 - prompt_length/max:657.0 - prompt_length/min:185.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.17874276638031e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.8313340423628688) - timing_s/agent_loop/generate_sequences/max:np.float64(28.131596444174647) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.779831280433427) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.131596444174647) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:226 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.448563271202147 - timing_s/reward:0.0010966956615447998 - timing_s/old_log_prob:9.950568269006908 - timing_s/ref:21.32860449515283 - timing_s/adv:0.13242483511567116 - timing_s/update_actor:22.545990073122084 - timing_s/update_weights:26.641770981252193 - timing_s/step:111.44921261630952 - timing_s/stop_profile:6.624404340982437e-05 - timing_per_token_ms/adv:0.00012046637864076018 - timing_per_token_ms/update_actor:0.020510003086710506 - timing_per_token_ms/gen:0.03375028627712027 - timing_per_token_ms/ref:0.01940255196653849 - perf/total_num_tokens:1546137 - perf/time_per_step:111.44921261630952 - perf/throughput:3468.2546509389554 - frontier/active_count:33.0 - frontier/completed_count:31.0 - frontier/blacklisted_count:1635.0 - frontier/mean_score:2.6046143995154525 - frontier/mean_frontier_pct:0.6765920673097678 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:2.6878699999999993 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.5052046674299993 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:1.9602267325699994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:160.0 - frontier/cluster_8/score:2.166103686169456 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.0379941624386992 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:160.0 - frontier/cluster_12/score:2.991453711059029 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:2.8798463929999993 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:2.9992020773370234 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:144.0 - frontier/cluster_15/score:1.9633719692477507 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:2.0225519899999993 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:192.0 - frontier/cluster_18/score:4.578757520187148 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:176.0 - frontier/cluster_21/score:2.3525398207598096 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:160.0 - frontier/cluster_23/score:2.8700648340814885 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:2.320324544477299 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.6389959265699994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:144.0 - frontier/cluster_32/score:1.5862405215529096 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:2.9936043564230332 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:3.2879747038163694 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:128.0 - frontier/cluster_37/score:2.5107396720320567 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:160.0 - frontier/cluster_38/score:2.6981536936490054 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:112.0 - frontier/cluster_39/score:3.1516398584900562 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:144.0 - frontier/cluster_42/score:2.9191694315999794 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:2.927782186402089 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:144.0 - frontier/cluster_44/score:2.45310986140104 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:112.0 - frontier/cluster_45/score:2.779112996950909 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9596463929999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:144.0 - frontier/cluster_47/score:3.056474665060127 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:1.6407742690999998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:176.0 - frontier/cluster_49/score:3.733938517060089 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:160.0 - 
frontier/cluster_50/score:2.0441801456684896 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:144.0 - frontier/cluster_51/score:2.074395204305945 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:176.0 - frontier/cluster_53/score:2.531025482898624 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:144.0 - frontier/cluster_57/score:1.6298051892715097 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:208.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.03127165620974784 - cluster/prob_snapshot/cluster_5:0.029146461359711082 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.022805990049402074 - cluster/prob_snapshot/cluster_8:0.02520123733237053 - 
cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.023710764585064013 - cluster/prob_snapshot/cluster_12:0.03480365941046715 - cluster/prob_snapshot/cluster_13:0.033505179319973946 - cluster/prob_snapshot/cluster_14:0.03489380671909169 - cluster/prob_snapshot/cluster_15:0.022842582875723633 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.023531104740043732 - cluster/prob_snapshot/cluster_18:0.05327092866808666 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.027370303063222027 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.033391377109415005 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.026995498833624346 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.03070303748121077 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.018454898583627075 - cluster/prob_snapshot/cluster_33:0.03482868080006271 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.03825349238024644 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.029210857614379245 - cluster/prob_snapshot/cluster_38:0.031391300438210555 - cluster/prob_snapshot/cluster_39:0.03666732325285054 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.03396267783895783 - cluster/prob_snapshot/cluster_43:0.03406288175774499 - cluster/prob_snapshot/cluster_44:0.028540371457873893 - cluster/prob_snapshot/cluster_45:0.032333210389152316 - cluster/prob_snapshot/cluster_46:0.034433601515071875 - cluster/prob_snapshot/cluster_47:0.03556013681449047 - cluster/prob_snapshot/cluster_48:0.019089364017268504 - 
cluster/prob_snapshot/cluster_49:0.0434419974231785 - cluster/prob_snapshot/cluster_50:0.02378273456161841 - cluster/prob_snapshot/cluster_51:0.024134267532360254 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.02944687010879127 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.018961745756960594 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  26%|██▌       | 209/800 [6:41:04<19:19:17, 117.69s/it]
[36m(TaskRunner pid=2823680)[0m step:209 - global_seqlen/min:376952 - global_seqlen/max:467644 - global_seqlen/minmax_diff:90692 - global_seqlen/balanced_min:421940 - global_seqlen/balanced_max:422071 - global_seqlen/mean:422003.75 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.25295764406522114) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012043608352541924 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.052042866744159255) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005039242177089262) - actor/ppo_kl:np.float64(4.0167528427698724e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.23712061097224554) - perf/mfu/actor:np.float64(0.23736641613694348) - perf/max_memory_allocated_gb:np.float64(80.07183074951172) - perf/max_memory_reserved_gb:np.float64(85.83203125) - perf/cpu_memory_used_gb:np.float64(104.31608581542969) - actor/lr:np.float64(1e-06) - training/global_step:209 - training/epoch:0 - critic/score/mean:0.5402777791023254 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5292160511016846 - critic/rewards/max:1.0225682258605957 - critic/rewards/min:-0.06869654357433319 - critic/advantages/mean:-0.2065224051475525 - critic/advantages/max:2.4748330116271973 - critic/advantages/min:-2.4748432636260986 - critic/returns/mean:-0.2065224051475525 - critic/returns/max:2.4748330116271973 - critic/returns/min:-2.4748432636260986 - response_length/mean:1331.9527587890625 - response_length/max:8192.0 - response_length/min:154.0 - response_length/clip_ratio:0.02638888917863369 - response_length_non_aborted/mean:1331.9527587890625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:154.0 - response_length_non_aborted/clip_ratio:0.02638888917863369 - response/aborted_ratio:0.0 - prompt_length/mean:234.27777099609375 - prompt_length/max:535.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00012128055095672607 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.334353401325643) - timing_s/agent_loop/generate_sequences/max:np.float64(30.670203786343336) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.446411957447708) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.670203786343336) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:200 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:33.05961588118225 - timing_s/reward:0.00021254271268844604 - timing_s/old_log_prob:10.732597257941961 - timing_s/ref:22.90292298514396 - timing_s/adv:0.06811671610921621 - timing_s/update_actor:21.077719559893012 - timing_s/update_weights:29.807935480959713 - timing_s/step:118.04748896509409 - timing_s/stop_profile:6.345007568597794e-05 - timing_per_token_ms/adv:6.040397425277623e-05 - timing_per_token_ms/update_actor:0.018691124621475316 - timing_per_token_ms/gen:0.034472793581252104 - timing_per_token_ms/ref:0.020309663315092995 - perf/total_num_tokens:1688015 - perf/time_per_step:118.04748896509409 - perf/throughput:3574.864266064854 - frontier/active_count:31.0 - frontier/completed_count:33.0 - frontier/blacklisted_count:1673.0 - frontier/mean_score:2.625539493930726 - frontier/mean_frontier_pct:0.6741363850872226 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:2.7815089999999993 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.5052046674299993 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:1.9602267325699994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:160.0 - frontier/cluster_8/score:2.166103686169456 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:144.0 - frontier/cluster_11/score:2.3265959137070893 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:3.59401759774132 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:2.9992020773370234 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:144.0 - frontier/cluster_15/score:2.2743603784734256 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:96.0 - 
frontier/cluster_17/score:2.3157863929999993 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:192.0 - frontier/cluster_18/score:4.105130264131003 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:176.0 - frontier/cluster_21/score:2.5467778745318665 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:160.0 - frontier/cluster_23/score:2.9090453838570416 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:2.320324544477299 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:112.0 - frontier/cluster_30/score:2.7472971485989994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:144.0 - frontier/cluster_32/score:1.5862405215529096 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:2.9936043564230332 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - 
frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:3.2879747038163694 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:2.0575177704224394 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:112.0 - frontier/cluster_39/score:3.1516398584900562 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:144.0 - frontier/cluster_42/score:2.9191694315999794 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:2.927782186402089 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:144.0 - frontier/cluster_44/score:2.45310986140104 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:128.0 - frontier/cluster_45/score:2.2453790978656363 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9596463929999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:144.0 - frontier/cluster_47/score:3.0395322655420887 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:1.6407742690999998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:176.0 - frontier/cluster_49/score:3.733938517060089 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:176.0 - frontier/cluster_50/score:2.3309261019679424 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:144.0 - frontier/cluster_51/score:2.3520766430141613 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:176.0 - frontier/cluster_53/score:2.531025482898624 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:144.0 - frontier/cluster_57/score:1.6298051892715097 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:209.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.03417434663679865 - cluster/prob_snapshot/cluster_5:0.030779599383240784 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.024083858022953146 - cluster/prob_snapshot/cluster_8:0.026613316089360965 - cluster/prob_snapshot/cluster_9:0.0 - 
cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.02858516554819178 - cluster/prob_snapshot/cluster_12:0.044157039651486395 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.036848980688080485 - cluster/prob_snapshot/cluster_15:0.027943386108386287 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.028452356951267684 - cluster/prob_snapshot/cluster_18:0.050436703471254535 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.03129037867257713 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.03574129198579244 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.02850811386655199 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.033754010887798946 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.019488965677581998 - cluster/prob_snapshot/cluster_33:0.03678020562573456 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.040396916659714566 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.025279201145058154 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.038721871113266304 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.035865678682702164 - cluster/prob_snapshot/cluster_43:0.03597149723950174 - cluster/prob_snapshot/cluster_44:0.03013954897922971 - cluster/prob_snapshot/cluster_45:0.027587314519378694 - cluster/prob_snapshot/cluster_46:0.0363629892107963 - cluster/prob_snapshot/cluster_47:0.03734448792233614 - cluster/prob_snapshot/cluster_48:0.0201589815546034 - cluster/prob_snapshot/cluster_49:0.04587614464037522 - cluster/prob_snapshot/cluster_50:0.028638367373039007 - 
cluster/prob_snapshot/cluster_51:0.02889822844890442 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.03109684067142989 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.020024212572606385 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826759) WARNING:2026-04-12 18:13:22,831:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  26%|██▋       | 210/800 [6:43:06<19:31:29, 119.13s/it]
[36m(TaskRunner pid=2823680)[0m step:210 - global_seqlen/min:413553 - global_seqlen/max:497805 - global_seqlen/minmax_diff:84252 - global_seqlen/balanced_min:439582 - global_seqlen/balanced_max:439673 - global_seqlen/mean:439627.5 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.23039552880490713) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011458245106041431 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.028926138307724614) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00046185081644464245) - actor/ppo_kl:np.float64(2.783328465891421e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.23346350093682608) - perf/mfu/actor:np.float64(0.2242737585565939) - perf/max_memory_allocated_gb:np.float64(80.81972646713257) - perf/max_memory_reserved_gb:np.float64(86.58984375) - perf/cpu_memory_used_gb:np.float64(104.95915985107422) - actor/lr:np.float64(1e-06) - training/global_step:210 - training/epoch:0 - critic/score/mean:0.5492021441459656 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5385910868644714 - critic/rewards/max:1.0103626251220703 - critic/rewards/min:-0.06308756023645401 - critic/advantages/mean:-0.08372459560632706 - critic/advantages/max:2.4746885299682617 - critic/advantages/min:-2.4748430252075195 - critic/returns/mean:-0.08372459560632706 - critic/returns/max:2.4746885299682617 - critic/returns/min:-2.4748430252075195 - response_length/mean:1355.7540283203125 - response_length/max:8192.0 - response_length/min:197.0 - response_length/clip_ratio:0.0226063821464777 - response_length_non_aborted/mean:1355.7540283203125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:197.0 - response_length_non_aborted/clip_ratio:0.0226063821464777 - response/aborted_ratio:0.0 - prompt_length/mean:238.65957641601562 - prompt_length/max:657.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.932780474424362e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5159498937427998) - timing_s/agent_loop/generate_sequences/max:np.float64(30.951259260065854) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.897403480978028) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.951259260065854) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:204 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.79605202563107 - timing_s/reward:0.0001545669510960579 - timing_s/old_log_prob:10.686597366817296 - timing_s/ref:24.32901754230261 - timing_s/adv:0.09233916085213423 - timing_s/update_actor:23.219887339510024 - timing_s/update_weights:30.736272151581943 - timing_s/step:122.2428978541866 - timing_s/stop_profile:5.625654011964798e-05 - timing_per_token_ms/adv:7.701354284043125e-05 - timing_per_token_ms/update_actor:0.01936606063850764 - timing_per_token_ms/gen:0.03216790926148211 - timing_per_token_ms/ref:0.020291107450717313 - perf/total_num_tokens:1758510 - perf/time_per_step:122.2428978541866 - perf/throughput:3596.343899867256 - frontier/active_count:30.0 - frontier/completed_count:34.0 - frontier/blacklisted_count:1707.0 - frontier/mean_score:2.5772730655164255 - frontier/mean_frontier_pct:0.6902491663796262 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:2.7815089999999993 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.5052046674299993 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:1.9602267325699994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:160.0 - frontier/cluster_8/score:2.166103686169456 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:144.0 - frontier/cluster_11/score:2.3265959137070893 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:3.59401759774132 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:2.399441454135916 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.492052264931398 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:96.0 - 
frontier/cluster_17/score:2.3157863929999993 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:192.0 - frontier/cluster_18/score:4.105130264131003 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:176.0 - frontier/cluster_23/score:2.936331768699929 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:144.0 - frontier/cluster_27/score:2.5242271811341093 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:112.0 - frontier/cluster_30/score:2.7472971485989994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:160.0 - frontier/cluster_32/score:1.4103683650870367 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:2.995523049496123 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:3.2015822926714583 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:2.3402624392957074 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:128.0 - frontier/cluster_39/score:3.1061479009430393 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:160.0 - frontier/cluster_42/score:2.343418602119985 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:2.949447530481462 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:144.0 - frontier/cluster_44/score:2.45310986140104 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:144.0 - frontier/cluster_45/score:1.8717653685059454 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9596463929999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:144.0 - frontier/cluster_47/score:3.0395322655420887 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:1.6407742690999998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:192.0 - frontier/cluster_49/score:3.5137569619420623 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:192.0 - frontier/cluster_50/score:1.9316482713775596 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:160.0 - frontier/cluster_51/score:2.5464536501099126 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:176.0 - frontier/cluster_53/score:2.531025482898624 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:144.0 - frontier/cluster_57/score:1.6298051892715097 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:210.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.03597483243324406 - cluster/prob_snapshot/cluster_5:0.03240123189311095 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.025352723372590653 - cluster/prob_snapshot/cluster_8:0.028015446702843123 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - 
cluster/prob_snapshot/cluster_11:0.03009118364725151 - cluster/prob_snapshot/cluster_12:0.04648346665097052 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.031033336309865992 - cluster/prob_snapshot/cluster_15:0.03223112441692383 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.029951377989199984 - cluster/prob_snapshot/cluster_18:0.05309397645981077 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.037977243053102155 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.032647260844649285 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.03553235116808115 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.01824109345076882 - cluster/prob_snapshot/cluster_33:0.038742797436766625 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.04140787842142415 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.030267940568762527 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.040173571341778375 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.030308761011455836 - cluster/prob_snapshot/cluster_43:0.038146876634127774 - cluster/prob_snapshot/cluster_44:0.0317274602398342 - cluster/prob_snapshot/cluster_45:0.024208602411982385 - cluster/prob_snapshot/cluster_46:0.038278784303710754 - cluster/prob_snapshot/cluster_47:0.03931199357194794 - cluster/prob_snapshot/cluster_48:0.02122106359952493 - cluster/prob_snapshot/cluster_49:0.04544541035711566 - cluster/prob_snapshot/cluster_50:0.024983101935953923 - cluster/prob_snapshot/cluster_51:0.03293472836569172 - 
cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.03273518713459084 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.02107919427291955 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826757) WARNING:2026-04-12 18:15:31,802:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  26%|██▋       | 211/800 [6:45:01<19:18:19, 118.00s/it]
[36m(TaskRunner pid=2823680)[0m step:211 - global_seqlen/min:367147 - global_seqlen/max:499440 - global_seqlen/minmax_diff:132293 - global_seqlen/balanced_min:421173 - global_seqlen/balanced_max:421389 - global_seqlen/mean:421280.0 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.21033606512678993) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012515673413872719 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05195926668238826) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00036231966757137947) - actor/ppo_kl:np.float64(1.6516550593021546e-05) - actor/pg_clipfrac_lower:np.float64(3.466026262483663e-06) - actor/grad_norm:np.float64(0.218168493360281) - perf/mfu/actor:np.float64(0.2331388005664388) - perf/max_memory_allocated_gb:np.float64(80.81972646713257) - perf/max_memory_reserved_gb:np.float64(86.58984375) - perf/cpu_memory_used_gb:np.float64(104.45526885986328) - actor/lr:np.float64(1e-06) - training/global_step:211 - training/epoch:0 - critic/score/mean:0.581944465637207 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5706595778465271 - critic/rewards/max:1.0352901220321655 - critic/rewards/min:-0.15238803625106812 - critic/advantages/mean:-0.1295689046382904 - critic/advantages/max:2.474841594696045 - critic/advantages/min:-2.474838972091675 - critic/returns/mean:-0.1295689046382904 - critic/returns/max:2.474841594696045 - critic/returns/min:-2.474838972091675 - response_length/mean:1282.6138916015625 - response_length/max:8192.0 - response_length/min:158.0 - response_length/clip_ratio:0.01805555634200573 - response_length_non_aborted/mean:1282.6138916015625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:158.0 - response_length_non_aborted/clip_ratio:0.01805555634200573 - response/aborted_ratio:0.0 - prompt_length/mean:243.04444885253906 - prompt_length/max:657.0 - prompt_length/min:179.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.701998740434647e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1861182786524296) - timing_s/agent_loop/generate_sequences/max:np.float64(29.622143500484526) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.545082461146194) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.622143500484526) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:195 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.609433229081333 - timing_s/reward:0.0001885145902633667 - timing_s/old_log_prob:10.321865731850266 - timing_s/ref:21.864682130515575 - timing_s/adv:0.06435176637023687 - timing_s/update_actor:21.33713840972632 - timing_s/update_weights:29.549047864973545 - timing_s/step:115.13893520273268 - timing_s/stop_profile:8.590705692768097e-05 - timing_per_token_ms/adv:5.858287621758628e-05 - timing_per_token_ms/update_actor:0.019424345418941478 - timing_per_token_ms/gen:0.03422853204402612 - timing_per_token_ms/ref:0.019904596859384543 - perf/total_num_tokens:1685120 - perf/time_per_step:115.13893520273268 - perf/throughput:3658.8839323398697 - frontier/active_count:27.0 - frontier/completed_count:37.0 - frontier/blacklisted_count:1745.0 - frontier/mean_score:2.477826962396821 - frontier/mean_frontier_pct:0.6914636622710794 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.5052046674299993 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:1.9602267325699994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:176.0 - frontier/cluster_8/score:1.8162725803186193 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:144.0 - frontier/cluster_11/score:2.5286171395949624 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:3.59401759774132 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:2.579609017895141 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.492052264931398 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:96.0 - frontier/cluster_17/score:2.3157863929999993 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:208.0 - frontier/cluster_18/score:4.373591184891702 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:176.0 - frontier/cluster_23/score:2.936331768699929 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:144.0 - frontier/cluster_27/score:2.6669590267938763 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:128.0 - frontier/cluster_30/score:2.2231080040192994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:160.0 - frontier/cluster_32/score:1.4103683650870367 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:2.996866134647286 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:2.3402624392957074 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.4743035306601273 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:176.0 - frontier/cluster_42/score:1.9403930214839895 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:160.0 - frontier/cluster_43/score:2.964613271337023 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:160.0 - frontier/cluster_44/score:2.0171769029807276 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:144.0 - frontier/cluster_45/score:1.8717653685059454 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9596463929999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:1.6407742690999998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:192.0 - frontier/cluster_49/score:3.5137569619420623 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:192.0 - frontier/cluster_50/score:1.9316482713775596 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:160.0 - frontier/cluster_51/score:2.5464536501099126 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:192.0 - frontier/cluster_53/score:2.6717178380290365 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:144.0 - frontier/cluster_57/score:1.6298051892715097 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:211.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.03744626217288835 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.02930026640155601 - cluster/prob_snapshot/cluster_8:0.027148528064100716 - cluster/prob_snapshot/cluster_9:0.0 - 
cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.03779621744089608 - cluster/prob_snapshot/cluster_12:0.05372116975858077 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.03855841274906439 - cluster/prob_snapshot/cluster_15:0.03724966813066536 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.03461495403393364 - cluster/prob_snapshot/cluster_18:0.0653737573922448 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.043890485542691045 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.03986406708403803 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.033229654342993645 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0210813209180135 - cluster/prob_snapshot/cluster_33:0.04479531610093031 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.034980806955437684 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.03698437094141784 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.02900380425822544 - cluster/prob_snapshot/cluster_43:0.04431322009055466 - cluster/prob_snapshot/cluster_44:0.030151522604179377 - cluster/prob_snapshot/cluster_45:0.027978000211499723 - cluster/prob_snapshot/cluster_46:0.044238978240853884 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.02452528699392767 - cluster/prob_snapshot/cluster_49:0.05252148302265235 - cluster/prob_snapshot/cluster_50:0.028873093099421116 - cluster/prob_snapshot/cluster_51:0.038062826655574526 - 
cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.03993519887437031 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.02436132791928873 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  26%|██▋       | 212/800 [6:47:04<19:29:14, 119.31s/it]
[36m(TaskRunner pid=2823680)[0m step:212 - global_seqlen/min:407382 - global_seqlen/max:531384 - global_seqlen/minmax_diff:124002 - global_seqlen/balanced_min:462660 - global_seqlen/balanced_max:462749 - global_seqlen/mean:462699.75 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.2460875721762375) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012365018017590046 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04799400681076804) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00042221085258999514) - actor/ppo_kl:np.float64(1.679959577381735e-05) - actor/pg_clipfrac_lower:np.float64(3.3239757984268423e-06) - actor/grad_norm:np.float64(0.23584616069610304) - perf/mfu/actor:np.float64(0.24317101799508425) - perf/max_memory_allocated_gb:np.float64(80.81972646713257) - perf/max_memory_reserved_gb:np.float64(86.58984375) - perf/cpu_memory_used_gb:np.float64(104.47985553741455) - actor/lr:np.float64(1e-06) - training/global_step:212 - training/epoch:0 - critic/score/mean:0.5605670213699341 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5494738817214966 - critic/rewards/max:1.017868995666504 - critic/rewards/min:-0.10631510615348816 - critic/advantages/mean:-0.17623953521251678 - critic/advantages/max:2.474825620651245 - critic/advantages/min:-2.474846839904785 - critic/returns/mean:-0.17623953521251678 - critic/returns/max:2.474825620651245 - critic/returns/min:-2.474846839904785 - response_length/mean:1284.3995361328125 - response_length/max:8192.0 - response_length/min:165.0 - response_length/clip_ratio:0.0167525764554739 - response_length_non_aborted/mean:1284.3995361328125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:165.0 - response_length_non_aborted/clip_ratio:0.0167525764554739 - response/aborted_ratio:0.0 - prompt_length/mean:234.58763122558594 - prompt_length/max:365.0 - prompt_length/min:179.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.834525942802429e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3140172101557255) - timing_s/agent_loop/generate_sequences/max:np.float64(32.182412401773036) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.278853997194346) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(32.182412401773036) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:200 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:34.021723373793066 - timing_s/reward:0.00017366372048854828 - timing_s/old_log_prob:11.57527229283005 - timing_s/ref:22.879269951954484 - timing_s/adv:0.10101955756545067 - timing_s/update_actor:22.766388337127864 - timing_s/update_weights:30.372946145944297 - timing_s/step:122.12354837823659 - timing_s/stop_profile:6.887596100568771e-05 - timing_per_token_ms/adv:8.570174234852873e-05 - timing_per_token_ms/update_actor:0.019314271359889394 - timing_per_token_ms/gen:0.03413457226971675 - timing_per_token_ms/ref:0.01941003648995828 - perf/total_num_tokens:1850799 - perf/time_per_step:122.12354837823659 - perf/throughput:3788.7840317818413 - frontier/active_count:26.0 - frontier/completed_count:38.0 - frontier/blacklisted_count:1775.0 - frontier/mean_score:2.375542796347217 - frontier/mean_frontier_pct:0.6960267424190169 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:144.0 - frontier/cluster_5/score:2.053643267200999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:2.2721587127989995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:176.0 - frontier/cluster_8/score:1.8162725803186193 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:144.0 - frontier/cluster_11/score:2.5286171395949624 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:3.59401759774132 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:192.0 - frontier/cluster_14/score:2.1057263125265986 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.492052264931398 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:96.0 - frontier/cluster_17/score:2.3157863929999993 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:176.0 - frontier/cluster_23/score:2.95543223808995 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:160.0 - frontier/cluster_27/score:2.7668713187557135 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:144.0 - frontier/cluster_30/score:1.8561756028135095 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:176.0 - frontier/cluster_32/score:1.2872578555609255 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:2.9978062942531 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - 
frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:2.3402624392957074 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.4743035306601273 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:192.0 - frontier/cluster_42/score:1.6582751150387927 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:160.0 - frontier/cluster_43/score:2.975229289935916 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:160.0 - frontier/cluster_44/score:2.0171769029807276 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:144.0 - frontier/cluster_45/score:1.8717653685059454 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9596463929999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:1.6407742690999998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:192.0 - frontier/cluster_49/score:3.3596298733594434 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:192.0 - frontier/cluster_50/score:1.9316482713775596 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:160.0 - frontier/cluster_51/score:2.682517555076939 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:192.0 - frontier/cluster_53/score:2.7702024866203256 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:144.0 - frontier/cluster_57/score:2.0408636324900566 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:212.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.033249781746380234 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.036787684842982366 - cluster/prob_snapshot/cluster_8:0.029406600382859117 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - 
cluster/prob_snapshot/cluster_11:0.040939908773094565 - cluster/prob_snapshot/cluster_12:0.05818941518524826 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.034093039150147 - cluster/prob_snapshot/cluster_15:0.0403479003549021 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.03749404454427616 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.04785031482933254 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.04479739443468907 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.03005265552309012 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.020841517819718342 - cluster/prob_snapshot/cluster_33:0.04853637756556125 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.037890327194892395 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.04006053713548641 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.026848521615753868 - cluster/prob_snapshot/cluster_43:0.0481708416041688 - cluster/prob_snapshot/cluster_44:0.032659368274492 - cluster/prob_snapshot/cluster_45:0.030305063677431937 - cluster/prob_snapshot/cluster_46:0.04791854465933391 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.026565171865027697 - cluster/prob_snapshot/cluster_49:0.054394529868895335 - cluster/prob_snapshot/cluster_50:0.03127460570190175 - cluster/prob_snapshot/cluster_51:0.04343165371593819 - cluster/prob_snapshot/cluster_52:0.0 - 
cluster/prob_snapshot/cluster_53:0.0448513281466574 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.033042871387739196 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(RewardLoopWorker pid=2826755)[0m WARNING:2026-04-12 18:19:27,347:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  27%|██▋       | 213/800 [6:49:02<19:24:52, 119.07s/it]
[36m(RewardLoopWorker pid=2826754)[0m WARNING:2026-04-12 18:19:28,310:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m step:213 - global_seqlen/min:395696 - global_seqlen/max:449910 - global_seqlen/minmax_diff:54214 - global_seqlen/balanced_min:417273 - global_seqlen/balanced_max:417323 - global_seqlen/mean:417291.0 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.21740404588683526) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012074318714439869 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.009291645139455795) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004027379323370227) - actor/ppo_kl:np.float64(2.7751452203305308e-05) - actor/pg_clipfrac_lower:np.float64(2.3220630895111196e-06) - actor/grad_norm:np.float64(0.2274633921109713) - perf/mfu/actor:np.float64(0.22119730290990028) - perf/max_memory_allocated_gb:np.float64(80.81972646713257) - perf/max_memory_reserved_gb:np.float64(86.58984375) - perf/cpu_memory_used_gb:np.float64(104.1080093383789) - actor/lr:np.float64(1e-06) - training/global_step:213 - training/epoch:0 - critic/score/mean:0.5140306353569031 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5026988387107849 - critic/rewards/max:1.042266845703125 - critic/rewards/min:-0.16818372905254364 - critic/advantages/mean:-0.1049850657582283 - critic/advantages/max:2.474787712097168 - critic/advantages/min:-2.4748306274414062 - critic/returns/mean:-0.1049850657582283 - critic/returns/max:2.474787712097168 - critic/returns/min:-2.4748306274414062 - response_length/mean:1312.778076171875 - response_length/max:8192.0 - response_length/min:186.0 - response_length/clip_ratio:0.012755102477967739 - response_length_non_aborted/mean:1312.778076171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:186.0 - response_length_non_aborted/clip_ratio:0.012755102477967739 - response/aborted_ratio:0.0 - prompt_length/mean:234.06121826171875 - prompt_length/max:684.0 - prompt_length/min:179.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.82493332028389e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.459424332715571) - timing_s/agent_loop/generate_sequences/max:np.float64(29.90622112341225) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.584062423708019) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.90622112341225) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:216 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.804904982447624 - timing_s/reward:0.00014723185449838638 - timing_s/old_log_prob:10.553246674127877 - timing_s/ref:23.662731405347586 - timing_s/adv:0.08651400078088045 - timing_s/update_actor:22.216483597643673 - timing_s/update_weights:29.552136581391096 - timing_s/step:118.28637792542577 - timing_s/stop_profile:5.19314780831337e-05 - timing_per_token_ms/adv:7.133869162172407e-05 - timing_per_token_ms/update_actor:0.018319518898513982 - timing_per_token_ms/gen:0.030902010052727046 - timing_per_token_ms/ref:0.019512082245846604 - perf/total_num_tokens:1669164 - perf/time_per_step:118.28637792542577 - perf/throughput:3527.8026710994836 - frontier/active_count:25.0 - frontier/completed_count:39.0 - frontier/blacklisted_count:1805.0 - frontier/mean_score:2.3499175275280715 - frontier/mean_frontier_pct:0.7145688186804969 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:144.0 - frontier/cluster_5/score:2.337550287040699 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:2.2721587127989995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:176.0 - frontier/cluster_8/score:1.8162725803186193 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:160.0 - frontier/cluster_11/score:2.6700319977164737 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:3.59401759774132 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:192.0 - frontier/cluster_14/score:2.374008418768619 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.492052264931398 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:96.0 - frontier/cluster_17/score:2.3157863929999993 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:2.9688025666629647 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:160.0 - frontier/cluster_27/score:2.836809923128999 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.5993229219694567 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:176.0 - frontier/cluster_32/score:1.2872578555609255 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - 
frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:160.0 - frontier/cluster_37/score:2.5381837075069953 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.4743035306601273 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:208.0 - frontier/cluster_42/score:1.4607925805271549 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:176.0 - frontier/cluster_43/score:2.9826605029551407 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:176.0 - frontier/cluster_44/score:1.7120238320865093 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:144.0 - frontier/cluster_45/score:2.2102357579541616 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9596463929999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:112.0 - frontier/cluster_48/score:1.44854198837 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:208.0 - frontier/cluster_49/score:3.25174091135161 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:208.0 - frontier/cluster_50/score:1.6521537899642917 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:160.0 - frontier/cluster_51/score:2.682517555076939 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:192.0 - frontier/cluster_53/score:2.7702024866203256 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:144.0 - frontier/cluster_57/score:2.0408636324900566 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:213.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.039789486391032934 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.038676399255409305 - cluster/prob_snapshot/cluster_8:0.03091636296239205 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - 
cluster/prob_snapshot/cluster_11:0.045448948168408915 - cluster/prob_snapshot/cluster_12:0.06117691460469158 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.04041007211458845 - cluster/prob_snapshot/cluster_15:0.04241939958723302 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0394190241295153 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.05053458313979064 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.04828782099622195 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.027223473219535815 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.02191154098782385 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.043204643188937186 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.042117282869291166 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.024865427206099338 - cluster/prob_snapshot/cluster_43:0.05077047118487882 - cluster/prob_snapshot/cluster_44:0.02914185390816543 - cluster/prob_snapshot/cluster_45:0.037622354522018585 - cluster/prob_snapshot/cluster_46:0.050378727905626794 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.024656899170308368 - cluster/prob_snapshot/cluster_49:0.055350723985146594 - cluster/prob_snapshot/cluster_50:0.028122753596416257 - cluster/prob_snapshot/cluster_51:0.04566147575227862 - cluster/prob_snapshot/cluster_52:0.0 - 
cluster/prob_snapshot/cluster_53:0.04715403760632163 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0347393235478674 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(RewardLoopWorker pid=2826759)[0m WARNING:2026-04-12 18:21:26,893:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  27%|██▋       | 214/800 [6:50:57<19:09:15, 117.67s/it]
[36m(TaskRunner pid=2823680)[0m step:214 - global_seqlen/min:379078 - global_seqlen/max:479870 - global_seqlen/minmax_diff:100792 - global_seqlen/balanced_min:445024 - global_seqlen/balanced_max:445037 - global_seqlen/mean:445028.0 - frontier/skipped_zero_acc_count:46.0 - actor/entropy:np.float64(0.24016743475889288) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01322778221219778 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.014257984410505742) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003444458646454888) - actor/ppo_kl:np.float64(3.6452422090344883e-05) - actor/pg_clipfrac_lower:np.float64(4.095427179780062e-06) - actor/grad_norm:np.float64(0.21444922550158066) - perf/mfu/actor:np.float64(0.263902264716542) - perf/max_memory_allocated_gb:np.float64(80.81972646713257) - perf/max_memory_reserved_gb:np.float64(86.58984375) - perf/cpu_memory_used_gb:np.float64(104.48406600952148) - actor/lr:np.float64(1e-06) - training/global_step:214 - training/epoch:0 - critic/score/mean:0.5076219439506531 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.4942041337490082 - critic/rewards/max:1.0244181156158447 - critic/rewards/min:-0.07407352328300476 - critic/advantages/mean:-0.09782637655735016 - critic/advantages/max:2.47471022605896 - critic/advantages/min:-2.474846363067627 - critic/returns/mean:-0.09782637655735016 - critic/returns/max:2.47471022605896 - critic/returns/min:-2.474846363067627 - response_length/mean:1413.33837890625 - response_length/max:8192.0 - response_length/min:289.0 - response_length/clip_ratio:0.018292682245373726 - response_length_non_aborted/mean:1413.33837890625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:289.0 - response_length_non_aborted/clip_ratio:0.018292682245373726 - response/aborted_ratio:0.0 - prompt_length/mean:248.89024353027344 - prompt_length/max:684.0 - prompt_length/min:190.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.671637624502182e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(2.4083185670897365) - timing_s/agent_loop/generate_sequences/max:np.float64(30.656652812846005) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.530345363667038) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.656652812846005) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:219 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.528051297180355 - timing_s/reward:0.00013746973127126694 - timing_s/old_log_prob:9.351342899724841 - timing_s/ref:22.256565452553332 - timing_s/adv:0.05559211038053036 - timing_s/update_actor:19.88777704630047 - timing_s/update_weights:29.635190956294537 - timing_s/step:114.13834895007312 - timing_s/stop_profile:5.70220872759819e-05 - timing_per_token_ms/adv:5.098219806692304e-05 - timing_per_token_ms/update_actor:0.01823860582994517 - timing_per_token_ms/gen:0.03508391446603069 - timing_per_token_ms/ref:0.020410965160784843 - perf/total_num_tokens:1780112 - perf/time_per_step:114.13834895007312 - perf/throughput:3899.0225817500304 - frontier/active_count:23.0 - frontier/completed_count:41.0 - frontier/blacklisted_count:1851.0 - frontier/mean_score:2.3004975780823633 - frontier/mean_frontier_pct:0.7454274466830507 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:7.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:160.0 - frontier/cluster_5/score:1.9362852009284892 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:2.2721587127989995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:176.0 - frontier/cluster_8/score:1.8162725803186193 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:160.0 - frontier/cluster_11/score:2.7690223984015314 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:3.415812318418924 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.492052264931398 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:112.0 - 
frontier/cluster_17/score:1.9210504750999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:2.9688025666629647 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:176.0 - frontier/cluster_27/score:2.285766946190299 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:176.0 - frontier/cluster_30/score:1.4195260453786196 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:176.0 - frontier/cluster_32/score:1.2872578555609255 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:176.0 - frontier/cluster_37/score:2.0767285952548966 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.632012471462089 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:176.0 - frontier/cluster_43/score:2.9826605029551407 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:192.0 - frontier/cluster_44/score:1.4984166824605565 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:144.0 - frontier/cluster_45/score:2.2102357579541616 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:2.9596463929999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:112.0 - frontier/cluster_48/score:1.9139793918589998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:208.0 - frontier/cluster_49/score:3.25174091135161 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:224.0 - frontier/cluster_50/score:1.4565076529750043 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:176.0 - 
frontier/cluster_51/score:2.7777622885538573 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:208.0 - frontier/cluster_53/score:2.8391417406342274 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:160.0 - frontier/cluster_57/score:1.7286045427430397 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:214.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.03659482795631671 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.042942670400235264 - cluster/prob_snapshot/cluster_8:0.03432664907352666 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.05233314711495019 - 
cluster/prob_snapshot/cluster_12:0.06455715514618778 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.047098549247591937 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0363068992098759 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.05610889300357518 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.043199859247986215 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.026828336747722593 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.024328533697969945 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.03924913830817579 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.049743727590258184 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.05637080111204939 - cluster/prob_snapshot/cluster_44:0.028319330579619533 - cluster/prob_snapshot/cluster_45:0.04177235732961582 - cluster/prob_snapshot/cluster_46:0.05593584587199885 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.03617325925097672 - cluster/prob_snapshot/cluster_49:0.061456287096739254 - cluster/prob_snapshot/cluster_50:0.027527270751292295 - cluster/prob_snapshot/cluster_51:0.05249832669506995 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.05365836783356393 - cluster/prob_snapshot/cluster_54:0.0 - 
cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.03266976673470186 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826762) WARNING:2026-04-12 18:23:21,348:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  27%|██▋       | 215/800 [6:53:08<19:48:26, 121.89s/it]
[36m(TaskRunner pid=2823680)[0m step:215 - global_seqlen/min:409697 - global_seqlen/max:460866 - global_seqlen/minmax_diff:51169 - global_seqlen/balanced_min:441761 - global_seqlen/balanced_max:441837 - global_seqlen/mean:441790.25 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.2448469213761237) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012738136574625969 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.024211502826801734) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00044636511821798716) - actor/ppo_kl:np.float64(-1.5822242667869728e-05) - actor/pg_clipfrac_lower:np.float64(7.004529277542226e-06) - actor/grad_norm:np.float64(0.2384588632446069) - perf/mfu/actor:np.float64(0.2053826345261613) - perf/max_memory_allocated_gb:np.float64(80.81972646713257) - perf/max_memory_reserved_gb:np.float64(86.58984375) - perf/cpu_memory_used_gb:np.float64(106.9161376953125) - actor/lr:np.float64(1e-06) - training/global_step:215 - training/epoch:0 - critic/score/mean:0.5090206265449524 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.49644115567207336 - critic/rewards/max:1.0595660209655762 - critic/rewards/min:-0.08811214566230774 - critic/advantages/mean:-0.10959137231111526 - critic/advantages/max:2.474806547164917 - critic/advantages/min:-2.4747769832611084 - critic/returns/mean:-0.10959137231111526 - critic/returns/max:2.474806547164917 - critic/returns/min:-2.4747769832611084 - response_length/mean:1422.4329833984375 - response_length/max:8192.0 - response_length/min:222.0 - response_length/clip_ratio:0.018041236326098442 - response_length_non_aborted/mean:1422.4329833984375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:222.0 - response_length_non_aborted/clip_ratio:0.018041236326098442 - response/aborted_ratio:0.0 - prompt_length/mean:236.27835083007812 - prompt_length/max:345.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.828751742839813e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7857538620010018) - timing_s/agent_loop/generate_sequences/max:np.float64(31.12961849756539) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.139453824513112) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(31.12961849756539) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:201 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:33.63426825962961 - timing_s/reward:0.000157269649207592 - timing_s/old_log_prob:11.905417260713875 - timing_s/ref:25.519608344882727 - timing_s/adv:0.08028433285653591 - timing_s/update_actor:25.367029143497348 - timing_s/update_weights:33.46033887472004 - timing_s/step:130.3820038214326 - timing_s/stop_profile:7.472001016139984e-05 - timing_per_token_ms/adv:6.237323476221753e-05 - timing_per_token_ms/update_actor:0.019707751284609023 - timing_per_token_ms/gen:0.030471122024509342 - timing_per_token_ms/ref:0.01982629070580404 - perf/total_num_tokens:1767161 - perf/time_per_step:130.3820038214326 - perf/throughput:3388.429668599534 - frontier/active_count:20.0 - frontier/completed_count:44.0 - frontier/blacklisted_count:1881.0 - frontier/mean_score:2.4575245993402404 - frontier/mean_frontier_pct:0.7502932877766672 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:8.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:176.0 - frontier/cluster_5/score:1.6553996406499425 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:2.4905110989592996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:176.0 - frontier/cluster_8/score:2.1713908062230334 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:176.0 - frontier/cluster_11/score:2.238315678881072 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:3.415812318418924 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.492052264931398 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:112.0 - 
frontier/cluster_17/score:2.2447353325699995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:2.9688025666629647 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:176.0 - frontier/cluster_27/score:2.285766946190299 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:176.0 - frontier/cluster_30/score:1.4195260453786196 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:176.0 - frontier/cluster_37/score:2.0767285952548966 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.632012471462089 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:176.0 - frontier/cluster_43/score:2.9826605029551407 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:160.0 - frontier/cluster_45/score:1.847165030567913 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:2.9717524750999993 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:128.0 - frontier/cluster_48/score:2.2397855743012998 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:208.0 - frontier/cluster_49/score:3.1762186379461266 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:176.0 - frontier/cluster_51/score:2.8444336019877 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:208.0 - frontier/cluster_53/score:2.887399218443959 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:160.0 - frontier/cluster_57/score:2.1100231799201277 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:215.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0336802252375085 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.050671132643553496 - cluster/prob_snapshot/cluster_8:0.044178414466450835 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.04554004626203907 - cluster/prob_snapshot/cluster_12:0.06949701173562922 - cluster/prob_snapshot/cluster_13:0.0 - 
cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.05070248870754797 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.04567065845795872 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.060402296022997785 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.04650547438678636 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02888121742015752 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.04225244776415312 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.053550073764647 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.060684245108998724 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.03758182178652072 - cluster/prob_snapshot/cluster_46:0.060462313905175376 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0455699522784554 - cluster/prob_snapshot/cluster_49:0.06462231626895679 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.05787192532582036 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.05874608985031371 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.042929848606329216 - cluster/prob_snapshot/cluster_58:0.0 - 
cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  27%|██▋       | 216/800 [6:55:09<19:43:36, 121.60s/it]
[36m(TaskRunner pid=2823680)[0m step:216 - global_seqlen/min:337527 - global_seqlen/max:454724 - global_seqlen/minmax_diff:117197 - global_seqlen/balanced_min:389050 - global_seqlen/balanced_max:389102 - global_seqlen/mean:389083.5 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.25108986442946657) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013543379493057728 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.02962600079990807) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003241227014416149) - actor/ppo_kl:np.float64(6.66627084441114e-05) - actor/pg_clipfrac_lower:np.float64(7.427576691748536e-07) - actor/grad_norm:np.float64(0.24282190547539637) - perf/mfu/actor:np.float64(0.19194579227469105) - perf/max_memory_allocated_gb:np.float64(80.81972646713257) - perf/max_memory_reserved_gb:np.float64(86.58984375) - perf/cpu_memory_used_gb:np.float64(107.82467269897461) - actor/lr:np.float64(1e-06) - training/global_step:216 - training/epoch:0 - critic/score/mean:0.5695876479148865 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.560271143913269 - critic/rewards/max:1.0443650484085083 - critic/rewards/min:-0.06285855174064636 - critic/advantages/mean:-0.10936921089887619 - critic/advantages/max:2.474851131439209 - critic/advantages/min:-2.4748544692993164 - critic/returns/mean:-0.10936921089887619 - critic/returns/max:2.474851131439209 - critic/returns/min:-2.4748544692993164 - response_length/mean:1198.48974609375 - response_length/max:8192.0 - response_length/min:169.0 - response_length/clip_ratio:0.01417525764554739 - response_length_non_aborted/mean:1198.48974609375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:169.0 - response_length_non_aborted/clip_ratio:0.01417525764554739 - response/aborted_ratio:0.0 - prompt_length/mean:233.3505096435547 - prompt_length/max:515.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.186845272779465e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4377019461244345) - timing_s/agent_loop/generate_sequences/max:np.float64(29.03664423339069) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.8568098105834) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.03664423339069) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:211 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.69721705466509 - timing_s/reward:0.0001401454210281372 - timing_s/old_log_prob:11.16666860319674 - timing_s/ref:25.049223210662603 - timing_s/adv:0.08848758414387703 - timing_s/update_actor:23.86938257049769 - timing_s/update_weights:29.236472951248288 - timing_s/step:120.62463732995093 - timing_s/stop_profile:6.160419434309006e-05 - timing_per_token_ms/adv:7.963904871882574e-05 - timing_per_token_ms/update_actor:0.021482504464460424 - timing_per_token_ms/gen:0.033006766521723095 - timing_per_token_ms/ref:0.02254436401381558 - perf/total_num_tokens:1556334 - perf/time_per_step:120.62463732995093 - perf/throughput:3225.572392277702 - frontier/active_count:19.0 - frontier/completed_count:45.0 - frontier/blacklisted_count:1912.0 - frontier/mean_score:2.4669342010593502 - frontier/mean_frontier_pct:0.7802907359422097 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:8.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:192.0 - frontier/cluster_5/score:1.4587797484549596 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:2.6433577692715096 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:192.0 - frontier/cluster_8/score:2.4199735643561233 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:176.0 - frontier/cluster_11/score:2.238315678881072 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:192.0 - frontier/cluster_12/score:3.2910686228932464 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.492052264931398 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:128.0 - 
frontier/cluster_17/score:2.4713147327989997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:2.9688025666629647 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:176.0 - frontier/cluster_27/score:2.500036862333209 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:176.0 - frontier/cluster_30/score:1.4195260453786196 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:192.0 - frontier/cluster_37/score:1.7537100166784276 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.7424087300234623 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:176.0 - frontier/cluster_43/score:2.9878623520685985 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:160.0 - frontier/cluster_45/score:1.847165030567913 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:3.5802267325699995 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.8678499020109098 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:192.0 - frontier/cluster_51/score:2.89110352139139 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:224.0 - frontier/cluster_53/score:2.921179452910771 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:176.0 - frontier/cluster_57/score:2.3770162259440895 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:216.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.03112279260008618 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.056395542718492654 - cluster/prob_snapshot/cluster_8:0.05162968256237233 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.047754045613204205 - cluster/prob_snapshot/cluster_12:0.07021433241820207 - cluster/prob_snapshot/cluster_13:0.0 - 
cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.05316746813367871 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.05272503677124869 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.06333884649188203 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.05333781802316336 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.03028532219996291 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.03741507461121816 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.05850877640684577 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.06374548344225782 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.03940891982177939 - cluster/prob_snapshot/cluster_46:0.07638346650827572 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.03985022767826812 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.06168115191957039 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.062322816283175306 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.05071319579631635 - cluster/prob_snapshot/cluster_58:0.0 - 
cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  27%|██▋       | 217/800 [6:57:03<19:18:42, 119.25s/it]
[36m(TaskRunner pid=2823680)[0m step:217 - global_seqlen/min:385897 - global_seqlen/max:430633 - global_seqlen/minmax_diff:44736 - global_seqlen/balanced_min:407756 - global_seqlen/balanced_max:407898 - global_seqlen/mean:407844.0 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.25231461991813586) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013169989921152592 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.014157541329041123) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0007152914292825525) - actor/ppo_kl:np.float64(6.207299974027804e-05) - actor/pg_clipfrac_lower:np.float64(2.0118625585343987e-06) - actor/grad_norm:np.float64(0.2567012868821621) - perf/mfu/actor:np.float64(0.24648906045375896) - perf/max_memory_allocated_gb:np.float64(80.81972646713257) - perf/max_memory_reserved_gb:np.float64(86.58984375) - perf/cpu_memory_used_gb:np.float64(115.19258117675781) - actor/lr:np.float64(1e-06) - training/global_step:217 - training/epoch:0 - critic/score/mean:0.5731382966041565 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5622629523277283 - critic/rewards/max:1.053873062133789 - critic/rewards/min:-0.060792211443185806 - critic/advantages/mean:-0.0870014876127243 - critic/advantages/max:2.474780321121216 - critic/advantages/min:-2.47483229637146 - critic/returns/mean:-0.0870014876127243 - critic/returns/max:2.474780321121216 - critic/returns/min:-2.47483229637146 - response_length/mean:1238.9747314453125 - response_length/max:8192.0 - response_length/min:194.0 - response_length/clip_ratio:0.013297872617840767 - response_length_non_aborted/mean:1238.9747314453125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:194.0 - response_length_non_aborted/clip_ratio:0.013297872617840767 - response/aborted_ratio:0.0 - prompt_length/mean:237.3829803466797 - prompt_length/max:371.0 - prompt_length/min:179.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.216368198394775e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5348049495369196) - timing_s/agent_loop/generate_sequences/max:np.float64(30.320695109665394) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.512984247826353) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.320695109665394) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:195 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.290310812182724 - timing_s/reward:0.00020894408226013184 - timing_s/old_log_prob:10.02167260274291 - timing_s/ref:21.676924441941082 - timing_s/adv:0.10610766243189573 - timing_s/update_actor:19.82468857895583 - timing_s/update_weights:28.584937715902925 - timing_s/step:113.04444322921336 - timing_s/stop_profile:7.119402289390564e-05 - timing_per_token_ms/adv:9.557346008758232e-05 - timing_per_token_ms/update_actor:0.01785652458290361 - timing_per_token_ms/gen:0.03465707727647015 - timing_per_token_ms/ref:0.01952487337380673 - perf/total_num_tokens:1631376 - perf/time_per_step:113.04444322921336 - perf/throughput:3607.819971947135 - frontier/active_count:19.0 - frontier/completed_count:45.0 - frontier/blacklisted_count:1946.0 - frontier/mean_score:2.53389762688031 - frontier/mean_frontier_pct:0.79804151962549 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:14.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:8.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:192.0 - frontier/cluster_5/score:1.4587797484549596 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.750350438490057 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:192.0 - frontier/cluster_8/score:2.5939814950492863 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:176.0 - frontier/cluster_11/score:2.4668209752167503 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:192.0 - frontier/cluster_12/score:3.203748036025272 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.492052264931398 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:128.0 - 
frontier/cluster_17/score:2.6299203129592996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:208.0 - frontier/cluster_23/score:2.378161796664075 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:192.0 - frontier/cluster_27/score:2.650025803633246 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:176.0 - frontier/cluster_30/score:1.4195260453786196 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:192.0 - frontier/cluster_37/score:2.127597011674899 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.7424087300234623 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:192.0 - frontier/cluster_43/score:2.991503646448019 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:160.0 - frontier/cluster_45/score:2.193015521397539 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:3.4061587127989994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:2.207494931407637 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:192.0 - frontier/cluster_51/score:2.9237724649739727 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:224.0 - frontier/cluster_53/score:2.9448256170375395 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:176.0 - frontier/cluster_57/score:2.5639113581608624 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:217.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.03030030916922126 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.05712751955750435 - cluster/prob_snapshot/cluster_8:0.05387958076774667 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.05123833004492469 - cluster/prob_snapshot/cluster_12:0.06654503950624893 - cluster/prob_snapshot/cluster_13:0.0 - 
cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.051762409077350066 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.05462606583172092 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.04939679055023366 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.05504367690979128 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02948497063678712 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.044192310257624295 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.05696256235808854 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.06213651201576603 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.04555111789948051 - cluster/prob_snapshot/cluster_46:0.07074931098170026 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.045851869675311345 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.06072966787686579 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.06116696282642095 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.05325499405721339 - cluster/prob_snapshot/cluster_58:0.0 - 
cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  27%|██▋       | 218/800 [6:59:29<20:33:15, 127.14s/it]
[36m(TaskRunner pid=2823680)[0m step:218 - global_seqlen/min:363907 - global_seqlen/max:533912 - global_seqlen/minmax_diff:170005 - global_seqlen/balanced_min:441585 - global_seqlen/balanced_max:441753 - global_seqlen/mean:441681.5 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.2408911237057219) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011540290899574757 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.0008308783581014723) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0007792159727188769) - actor/ppo_kl:np.float64(0.00024962841365912876) - actor/pg_clipfrac_lower:np.float64(2.2678200076235102e-06) - actor/grad_norm:np.float64(0.26610354085763294) - perf/mfu/actor:np.float64(0.2287522777168065) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(119.48117065429688) - actor/lr:np.float64(1e-06) - training/global_step:218 - training/epoch:0 - critic/score/mean:0.5292553305625916 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5192429423332214 - critic/rewards/max:1.0135807991027832 - critic/rewards/min:-0.04627547040581703 - critic/advantages/mean:-0.15580852329730988 - critic/advantages/max:2.4748034477233887 - critic/advantages/min:-2.474851608276367 - critic/returns/mean:-0.15580852329730988 - critic/returns/max:2.4748034477233887 - critic/returns/min:-2.474851608276367 - response_length/mean:1377.9454345703125 - response_length/max:8192.0 - response_length/min:257.0 - response_length/clip_ratio:0.021276595070958138 - response_length_non_aborted/mean:1377.9454345703125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:257.0 - response_length_non_aborted/clip_ratio:0.021276595070958138 - response/aborted_ratio:0.0 - prompt_length/mean:245.84042358398438 - prompt_length/max:447.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.353311568498611e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7664974443614483) - timing_s/agent_loop/generate_sequences/max:np.float64(31.90897059161216) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.906286091868424) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(31.90897059161216) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:202 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:52.66306453291327 - timing_s/reward:0.0002330336719751358 - timing_s/old_log_prob:10.806883034296334 - timing_s/ref:25.533337065950036 - timing_s/adv:0.10143702570348978 - timing_s/update_actor:23.25180205143988 - timing_s/update_weights:32.415887153707445 - timing_s/step:145.29100198391825 - timing_s/stop_profile:7.22566619515419e-05 - timing_per_token_ms/adv:8.307108805800879e-05 - timing_per_token_ms/update_actor:0.01904188813036244 - timing_per_token_ms/gen:0.05082252672747766 - timing_per_token_ms/ref:0.020910334043315534 - perf/total_num_tokens:1766726 - perf/time_per_step:145.29100198391825 - perf/throughput:3039.9783466899635 - frontier/active_count:17.0 - frontier/completed_count:47.0 - frontier/blacklisted_count:1979.0 - frontier/mean_score:2.5690992751213746 - frontier/mean_frontier_pct:0.8235769713462704 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:9.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:208.0 - frontier/cluster_5/score:1.3211458239184717 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.8252453069430397 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:208.0 - frontier/cluster_8/score:2.7157870465345004 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:176.0 - frontier/cluster_11/score:2.4668209752167503 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:208.0 - frontier/cluster_12/score:2.5426236252176904 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.492052264931398 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:144.0 - 
frontier/cluster_17/score:2.1409442190715096 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:208.0 - frontier/cluster_23/score:2.5647132576648524 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:192.0 - frontier/cluster_27/score:2.755018062543272 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.8196861110164235 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:192.0 - frontier/cluster_43/score:2.994052552513613 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:176.0 - frontier/cluster_45/score:2.4351108649782773 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:3.284311098959299 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:160.0 - frontier/cluster_48/score:1.8452464519853458 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:208.0 - frontier/cluster_51/score:2.946640725481781 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - 
frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:240.0 - frontier/cluster_53/score:2.9613779319262776 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:176.0 - frontier/cluster_57/score:2.5639113581608624 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:218.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.03024969139303766 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.06468839177129991 - cluster/prob_snapshot/cluster_8:0.062182174412222534 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.056481708431592295 - cluster/prob_snapshot/cluster_12:0.05821732816999627 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - 
cluster/prob_snapshot/cluster_15:0.05705941810867599 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.04902025252938143 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0587231047106437 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.06308042962812359 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.06456110532181866 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0685534965848422 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.055755656067510345 - cluster/prob_snapshot/cluster_46:0.07519941809873824 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.04224979158704811 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.06746792895852276 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.06780536025404721 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.05870474397249901 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - 
cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  27%|██▋       | 219/800 [7:01:37<20:34:00, 127.44s/it]
(TaskRunner pid=2823680) step:219 - global_seqlen/min:353420 - global_seqlen/max:548235 - global_seqlen/minmax_diff:194815 - global_seqlen/balanced_min:432676 - global_seqlen/balanced_max:432801 - global_seqlen/mean:432730.25 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.22549869931702102) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0124126635491848 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.009590053916326724) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006010699092870226) - actor/ppo_kl:np.float64(9.462311557172795e-05) - actor/pg_clipfrac_lower:np.float64(4.983054650994968e-06) - actor/grad_norm:np.float64(0.22950528332820305) - perf/mfu/actor:np.float64(0.2014208797143326) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(157.01879501342773) - actor/lr:np.float64(1e-06) - training/global_step:219 - training/epoch:0 - critic/score/mean:0.5992268323898315 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5892106294631958 - critic/rewards/max:1.0427888631820679 - critic/rewards/min:-0.06901472806930542 - critic/advantages/mean:-0.1981024295091629 - critic/advantages/max:2.4748141765594482 - critic/advantages/min:-2.474832773208618 - critic/returns/mean:-0.1981024295091629 - critic/returns/max:2.4748141765594482 - critic/returns/min:-2.474832773208618 - response_length/mean:1335.9652099609375 - response_length/max:8192.0 - response_length/min:193.0 - response_length/clip_ratio:0.02835051529109478 - response_length_non_aborted/mean:1335.9652099609375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:193.0 - response_length_non_aborted/clip_ratio:0.02835051529109478 - response/aborted_ratio:0.0 - prompt_length/mean:239.18556213378906 - prompt_length/max:817.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010029971599578857 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.8953164173290133) - timing_s/agent_loop/generate_sequences/max:np.float64(31.50358826853335) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.285726920898014) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(31.50358826853335) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:211 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:33.385638788342476 - timing_s/reward:0.00018473900854587555 - timing_s/old_log_prob:11.97682835906744 - timing_s/ref:25.601397545076907 - timing_s/adv:0.12682499270886183 - timing_s/update_actor:25.555971850641072 - timing_s/update_weights:30.749381041154265 - timing_s/step:127.83486043475568 - timing_s/stop_profile:6.300024688243866e-05 - timing_per_token_ms/adv:0.00010375785717523509 - timing_per_token_ms/update_actor:0.020907810208514708 - timing_per_token_ms/gen:0.03220348119707891 - timing_per_token_ms/ref:0.020944973803912494 - perf/total_num_tokens:1730921 - perf/time_per_step:127.83486043475568 - perf/throughput:3385.0723388621896 - frontier/active_count:15.0 - frontier/completed_count:49.0 - frontier/blacklisted_count:2008.0 - frontier/mean_score:2.4874529602058306 - frontier/mean_frontier_pct:0.8512019422873222 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:9.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:208.0 - frontier/cluster_5/score:1.3211458239184717 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:224.0 - frontier/cluster_8/score:2.2010509325741503 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:224.0 - frontier/cluster_12/score:2.079836537652383 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.492052264931398 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:144.0 - 
frontier/cluster_17/score:2.3986609533500562 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:224.0 - frontier/cluster_23/score:2.695299280365396 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:208.0 - frontier/cluster_27/score:2.8285126437802903 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:176.0 - frontier/cluster_39/score:2.8737802777114965 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:208.0 - frontier/cluster_43/score:2.995836786759529 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:192.0 - frontier/cluster_45/score:2.004577605484794 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:3.199017769271509 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:160.0 - frontier/cluster_48/score:2.191672516389742 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:208.0 - frontier/cluster_51/score:2.962648507837246 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - 
frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:256.0 - frontier/cluster_53/score:2.372964552348394 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:192.0 - frontier/cluster_57/score:2.6947379507126037 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:219.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.035408262857739965 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.058990755276889566 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.05574206684308599 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - 
cluster/prob_snapshot/cluster_15:0.06678993344595581 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.06428693638898195 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.07223719264872361 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.07580746755900429 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.07702069342110489 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.08029195150452562 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.05372503889330287 - cluster/prob_snapshot/cluster_46:0.08573744094727317 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.05873940268625587 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0794024665721275 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.06359824260159509 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.07222214835343381 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - 
cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  28%|██▊       | 220/800 [7:03:34<20:02:09, 124.36s/it]
(TaskRunner pid=2823680) step:220 - global_seqlen/min:404072 - global_seqlen/max:432932 - global_seqlen/minmax_diff:28860 - global_seqlen/balanced_min:422534 - global_seqlen/balanced_max:422599 - global_seqlen/mean:422571.25 - frontier/skipped_zero_acc_count:44.0 - actor/entropy:np.float64(0.23360265467670702) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011746251955628395 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.033914842337253504) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003769374845779523) - actor/ppo_kl:np.float64(-2.584775440287298e-05) - actor/pg_clipfrac_lower:np.float64(1.4376558283402119e-05) - actor/grad_norm:np.float64(0.22232576527378775) - perf/mfu/actor:np.float64(0.2539259700919403) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(183.20928382873535) - actor/lr:np.float64(1e-06) - training/global_step:220 - training/epoch:0 - critic/score/mean:0.5476190447807312 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5371829271316528 - critic/rewards/max:1.0750027894973755 - critic/rewards/min:-0.08677440881729126 - critic/advantages/mean:-0.171452596783638 - critic/advantages/max:2.474830389022827 - critic/advantages/min:-2.4748587608337402 - critic/returns/mean:-0.171452596783638 - critic/returns/max:2.474830389022827 - critic/returns/min:-2.4748587608337402 - response_length/mean:1303.480712890625 - response_length/max:8192.0 - response_length/min:232.0 - response_length/clip_ratio:0.013392857275903225 - response_length_non_aborted/mean:1303.480712890625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:232.0 - response_length_non_aborted/clip_ratio:0.013392857275903225 - response/aborted_ratio:0.0 - prompt_length/mean:248.20237731933594 - prompt_length/max:817.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010964740067720413 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(2.24153561796993) - timing_s/agent_loop/generate_sequences/max:np.float64(30.7035964820534) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.847207747022367) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.7035964820534) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:188 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.726364370435476 - timing_s/reward:0.00030656158924102783 - timing_s/old_log_prob:11.531211907975376 - timing_s/ref:22.83657370787114 - timing_s/adv:0.07546730991452932 - timing_s/update_actor:19.779513488523662 - timing_s/update_weights:29.55399081669748 - timing_s/step:116.96140110492706 - timing_s/stop_profile:5.418248474597931e-05 - timing_per_token_ms/adv:7.237466797719577e-05 - timing_per_token_ms/update_actor:0.018968951233370508 - timing_per_token_ms/gen:0.037361465091102775 - timing_per_token_ms/ref:0.021900733466129942 - perf/total_num_tokens:1690285 - perf/time_per_step:116.96140110492706 - perf/throughput:3612.9120035156534 - frontier/active_count:11.0 - frontier/completed_count:53.0 - frontier/blacklisted_count:2051.0 - frontier/mean_score:2.4621291347658354 - frontier/mean_frontier_pct:0.8444334963548471 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:10.0 - frontier/replay_slots_count:8.0 - 
frontier/replay_pool_size:4558.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:224.0 - frontier/cluster_8/score:2.440735652801905 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:224.0 - frontier/cluster_12/score:2.355885576356668 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.492052264931398 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:160.0 - frontier/cluster_17/score:1.9790626673450393 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:208.0 - frontier/cluster_27/score:2.879958850646203 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:176.0 - frontier/cluster_39/score:2.911646194398047 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:209.0 - frontier/cluster_43/score:0.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:208.0 - frontier/cluster_45/score:1.703204323839356 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:128.0 - frontier/cluster_46/score:3.139312438490056 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:176.0 - frontier/cluster_48/score:2.4341707614728194 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:272.0 - frontier/cluster_53/score:1.9610751866438758 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:192.0 - frontier/cluster_57/score:2.7863165654988222 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:220.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.09011918027066865 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.08698626445228815 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.09201394139077143 - cluster/prob_snapshot/cluster_16:0.0 - 
cluster/prob_snapshot/cluster_17:0.07307284796724084 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.10633660000645617 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.10750659047248344 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.06288734190515752 - cluster/prob_snapshot/cluster_46:0.1159127016665903 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.08987678506311553 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.07240869697077286 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.10287904983445519 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  28%|██▊       | 221/800 [7:05:37<19:56:44, 124.02s/it]
(TaskRunner pid=2823680) step:221 - global_seqlen/min:402903 - global_seqlen/max:451692 - global_seqlen/minmax_diff:48789 - global_seqlen/balanced_min:420525 - global_seqlen/balanced_max:420719 - global_seqlen/mean:420595.0 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.21933088687426866) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012906979769468307 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.06304595253823209) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005275647797153649) - actor/ppo_kl:np.float64(3.0079016973917183e-05) - actor/pg_clipfrac_lower:np.float64(9.154390236870583e-07) - actor/grad_norm:np.float64(0.23232903455694517) - perf/mfu/actor:np.float64(0.22008243140254075) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(186.8890733718872) - actor/lr:np.float64(1e-06) - training/global_step:221 - training/epoch:0 - critic/score/mean:0.5864361524581909 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5760935544967651 - critic/rewards/max:1.0094170570373535 - critic/rewards/min:-0.05286616459488869 - critic/advantages/mean:-0.20316369831562042 - critic/advantages/max:2.474839925765991 - critic/advantages/min:-2.474846839904785 - critic/returns/mean:-0.20316369831562042 - critic/returns/max:2.474839925765991 - critic/returns/min:-2.474846839904785 - response_length/mean:1311.2659912109375 - response_length/max:8192.0 - response_length/min:165.0 - response_length/clip_ratio:0.02393617108464241 - response_length_non_aborted/mean:1311.2659912109375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:165.0 - response_length_non_aborted/clip_ratio:0.02393617108464241 - response/aborted_ratio:0.0 - prompt_length/mean:230.13829040527344 - prompt_length/max:381.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.644683450460434e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3621586365625262) - timing_s/agent_loop/generate_sequences/max:np.float64(30.065223697572947) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.543262115741527) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.065223697572947) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:202 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.54183389246464 - timing_s/reward:0.00013121217489242554 - timing_s/old_log_prob:11.017351220361888 - timing_s/ref:24.78852667286992 - timing_s/adv:0.10610310267657042 - timing_s/update_actor:23.138664238154888 - timing_s/update_weights:30.766737116500735 - timing_s/step:122.93600463215262 - timing_s/stop_profile:8.51023942232132e-05 - timing_per_token_ms/adv:9.153637077665642e-05 - timing_per_token_ms/update_actor:0.019961992585990677 - timing_per_token_ms/gen:0.03300147848480094 - timing_per_token_ms/ref:0.021385347942665847 - perf/total_num_tokens:1682380 - perf/time_per_step:122.93600463215262 - perf/throughput:3421.2515792952477 - frontier/active_count:10.0 - frontier/completed_count:54.0 - frontier/blacklisted_count:2082.0 - frontier/mean_score:2.3990038821724005 - frontier/mean_frontier_pct:0.870448538224325 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:10.0 - frontier/replay_slots_count:40.0 - 
frontier/replay_pool_size:5061.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:240.0 - frontier/cluster_8/score:2.008514956961333 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:240.0 - frontier/cluster_12/score:2.549119903449667 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.492052264931398 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:160.0 - frontier/cluster_17/score:2.2853438671415276 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:192.0 - frontier/cluster_39/score:2.938152336078633 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:209.0 - frontier/cluster_43/score:0.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:224.0 - frontier/cluster_45/score:1.492243026687549 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:128.0 - frontier/cluster_46/score:3.097518706943039 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:176.0 - frontier/cluster_48/score:2.6039195330309735 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:288.0 - frontier/cluster_53/score:1.672752630650713 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:208.0 - frontier/cluster_57/score:2.850421595849175 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:221.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.08372287230909092 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.10625743136110852 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.10387862576840592 - cluster/prob_snapshot/cluster_16:0.0 - 
cluster/prob_snapshot/cluster_17:0.09526219962062132 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.12247384666247436 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.062202609915589596 - cluster/prob_snapshot/cluster_46:0.12911686929568883 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.1085416973428577 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.06972696639139925 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.11881688133276369 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  28%|██▊       | 222/800 [7:07:44<20:02:56, 124.87s/it]
(TaskRunner pid=2823680) step:222 - global_seqlen/min:398192 - global_seqlen/max:500398 - global_seqlen/minmax_diff:102206 - global_seqlen/balanced_min:436479 - global_seqlen/balanced_max:436590 - global_seqlen/mean:436537.75 - frontier/skipped_zero_acc_count:35.0 - actor/entropy:np.float64(0.2164748775911458) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011784138157963753 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.010531421219639014) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00044272803693565173) - actor/ppo_kl:np.float64(-3.7258940307506625e-06) - actor/pg_clipfrac_lower:np.float64(1.148643740576672e-06) - actor/grad_norm:np.float64(0.29252548639973003) - perf/mfu/actor:np.float64(0.22856690183373207) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(186.66943359375) - actor/lr:np.float64(1e-06) - training/global_step:222 - training/epoch:0 - critic/score/mean:0.5833333134651184 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5744826793670654 - critic/rewards/max:1.060172200202942 - critic/rewards/min:-0.0522916242480278 - critic/advantages/mean:-0.15322688221931458 - critic/advantages/max:2.474822759628296 - critic/advantages/min:-2.4748282432556152 - critic/returns/mean:-0.15322688221931458 - critic/returns/max:2.474822759628296 - critic/returns/min:-2.4748282432556152 - response_length/mean:1297.622314453125 - response_length/max:8192.0 - response_length/min:246.0 - response_length/clip_ratio:0.024193547666072845 - response_length_non_aborted/mean:1297.622314453125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:246.0 - response_length_non_aborted/clip_ratio:0.024193547666072845 - response/aborted_ratio:0.0 - prompt_length/mean:258.1397705078125 - prompt_length/max:1168.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010189507156610489 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.868777890689671) - timing_s/agent_loop/generate_sequences/max:np.float64(31.89408051315695) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.74752137750238) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(31.89408051315695) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:191 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:34.926563621498644 - timing_s/reward:0.00015987083315849304 - timing_s/old_log_prob:11.683832937851548 - timing_s/ref:24.19320499151945 - timing_s/adv:0.10151952039450407 - timing_s/update_actor:23.07355159521103 - timing_s/update_weights:32.19175854418427 - timing_s/step:126.60461840778589 - timing_s/stop_profile:6.028078496456146e-05 - timing_per_token_ms/adv:8.770683419727743e-05 - timing_per_token_ms/update_actor:0.01993417774472718 - timing_per_token_ms/gen:0.03617717229040568 - timing_per_token_ms/ref:0.02090149175888753 - perf/total_num_tokens:1746151 - perf/time_per_step:126.60461840778589 - perf/throughput:3448.039696260827 - frontier/active_count:8.0 - frontier/completed_count:56.0 - frontier/blacklisted_count:2113.0 - frontier/mean_score:2.6000972023638145 - frontier/mean_frontier_pct:0.8744485072834327 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:10.0 - frontier/replay_slots_count:48.0 - 
frontier/replay_pool_size:5202.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:240.0 - frontier/cluster_8/score:2.3059604698729332 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:240.0 - frontier/cluster_12/score:2.684383932414767 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.492052264931398 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:176.0 - frontier/cluster_17/score:2.499740706999069 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:192.0 - frontier/cluster_39/score:2.956706635255043 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:209.0 - frontier/cluster_43/score:0.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:3.0682630948601273 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:192.0 - frontier/cluster_48/score:2.7227436731216814 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:288.0 - 
frontier/cluster_53/score:2.070926841455499 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:222.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.11085933959394507 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.12905209514736243 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.11980572604486718 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.12017534886419301 - 
cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.14214404333456387 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.1475071341597677 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.13089624450608836 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.09956006834921242 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  28%|██▊       | 223/800 [7:09:40<19:33:40, 122.05s/it]
(TaskRunner pid=2823680) step:223 - global_seqlen/min:392691 - global_seqlen/max:435479 - global_seqlen/minmax_diff:42788 - global_seqlen/balanced_min:415312 - global_seqlen/balanced_max:415407 - global_seqlen/mean:415353.5 - frontier/skipped_zero_acc_count:22.0 - actor/entropy:np.float64(0.21317148911503125) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012642324902117252 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.025521870848024264) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005367356975769342) - actor/ppo_kl:np.float64(-2.227274370839813e-06) - actor/pg_clipfrac_lower:np.float64(6.623684865464281e-06) - actor/grad_norm:np.float64(0.4162613941090448) - perf/mfu/actor:np.float64(0.1534811821043674) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(187.79683303833008) - actor/lr:np.float64(1e-06) - training/global_step:223 - training/epoch:0 - critic/score/mean:0.5849056839942932 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5746303796768188 - critic/rewards/max:1.1344856023788452 - critic/rewards/min:-0.12676019966602325 - critic/advantages/mean:-0.22629021108150482 - critic/advantages/max:2.4747989177703857 - critic/advantages/min:-2.474862813949585 - critic/returns/mean:-0.22629021108150482 - critic/returns/max:2.4747989177703857 - critic/returns/min:-2.474862813949585 - response_length/mean:1290.4893798828125 - response_length/max:8192.0 - response_length/min:215.0 - response_length/clip_ratio:0.01886792480945587 - response_length_non_aborted/mean:1290.4893798828125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:215.0 - response_length_non_aborted/clip_ratio:0.01886792480945587 - response/aborted_ratio:0.0 - prompt_length/mean:240.66981506347656 - prompt_length/max:452.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.864754974842072e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6625851262360811) - timing_s/agent_loop/generate_sequences/max:np.float64(30.893069938756526) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.521481379691977) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.893069938756526) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:207 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.871338966302574 - timing_s/reward:0.00012728292495012283 - timing_s/old_log_prob:11.634444423019886 - timing_s/ref:13.159416560083628 - timing_s/adv:0.13572358340024948 - timing_s/update_actor:32.37444638926536 - timing_s/update_weights:24.532629597000778 - timing_s/step:115.19795260578394 - timing_s/stop_profile:7.675774395465851e-05 - timing_per_token_ms/adv:0.00010452955885735964 - timing_per_token_ms/update_actor:0.024933666755183295 - timing_per_token_ms/gen:0.030037729732031393 - timing_per_token_ms/ref:0.010134922563820594 - perf/total_num_tokens:1661414 - perf/time_per_step:115.19795260578394 - perf/throughput:3605.5632118859867 - frontier/active_count:5.0 - frontier/completed_count:59.0 - frontier/blacklisted_count:2134.0 - frontier/mean_score:2.793054028419308 - frontier/mean_frontier_pct:0.8281845329161157 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:5.0 - frontier/batch_hard_count:0.0 - frontier/force_completed_count:10.0 - frontier/replay_slots_count:64.0 - 
frontier/replay_pool_size:5523.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.492052264931398 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:176.0 - frontier/cluster_17/score:2.649818494899348 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:208.0 - frontier/cluster_39/score:2.9696946446785297 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:209.0 - frontier/cluster_43/score:0.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:3.047784166402089 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:192.0 - frontier/cluster_48/score:2.805920571185177 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:300.0 - 
frontier/cluster_53/score:0.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:209.0 - frontier/cluster_57/score:0.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:223.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.17844640594666492 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.18974344698938586 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 
- cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.2126485642212363 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.2182402585407159 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.20092132430199716 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  28%|██▊       | 224/800 [7:11:44<19:37:41, 122.68s/it]
[36m(TaskRunner pid=2823680)[0m step:224 - global_seqlen/min:385847 - global_seqlen/max:466754 - global_seqlen/minmax_diff:80907 - global_seqlen/balanced_min:419724 - global_seqlen/balanced_max:419857 - global_seqlen/mean:419803.5 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.19262813940188106) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011227523908019066 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.02185577354975976) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00043157642981636206) - actor/ppo_kl:np.float64(0.000188496738397132) - actor/pg_clipfrac_lower:np.float64(6.924040833985842e-06) - actor/grad_norm:np.float64(0.22532494079608184) - perf/mfu/actor:np.float64(0.20176610060030298) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(190.55043983459473) - actor/lr:np.float64(1e-06) - training/global_step:224 - training/epoch:0 - critic/score/mean:0.6159793734550476 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6071451902389526 - critic/rewards/max:1.0178494453430176 - critic/rewards/min:-0.05517469719052315 - critic/advantages/mean:-0.18438231945037842 - critic/advantages/max:2.4747018814086914 - critic/advantages/min:-2.4748213291168213 - critic/returns/mean:-0.18438231945037842 - critic/returns/max:2.4747018814086914 - critic/returns/min:-2.4748213291168213 - response_length/mean:1302.70751953125 - response_length/max:8192.0 - response_length/min:260.0 - response_length/clip_ratio:0.02835051529109478 - response_length_non_aborted/mean:1302.70751953125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:260.0 - response_length_non_aborted/clip_ratio:0.02835051529109478 - response/aborted_ratio:0.0 - prompt_length/mean:236.93814086914062 - prompt_length/max:391.0 - prompt_length/min:178.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010647810995578766 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(2.1212076554074883) - timing_s/agent_loop/generate_sequences/max:np.float64(30.78937246091664) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.700951430280838) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.78937246091664) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:215 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.51014214102179 - timing_s/reward:0.000174584798514843 - timing_s/old_log_prob:11.879213843494654 - timing_s/ref:22.891392313875258 - timing_s/adv:0.11260439641773701 - timing_s/update_actor:25.047462615184486 - timing_s/update_weights:30.804112072102726 - timing_s/step:123.84010963980108 - timing_s/stop_profile:5.328003317117691e-05 - timing_per_token_ms/adv:9.424815458917612e-05 - timing_per_token_ms/update_actor:0.020964342456620745 - timing_per_token_ms/gen:0.03215957066124357 - timing_per_token_ms/ref:0.019159744647587816 - perf/total_num_tokens:1679214 - perf/time_per_step:123.84010963980108 - perf/throughput:3389.8831422309963 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:31.0 - frontier/mean_score:2.075 - frontier/mean_frontier_pct:0.007582963432678081 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:14.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.3 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:2.9 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.3 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.3 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.3 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - 
frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.3 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.3 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.3 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:0.0 - frontier/cluster_50/score:2.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - frontier/cluster_54/cluster_size:122.0 - 
frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:224.0 - cluster/prob_snapshot/cluster_0:0.01506024096385542 - cluster/prob_snapshot/cluster_1:0.01506024096385542 - cluster/prob_snapshot/cluster_2:0.01506024096385542 - cluster/prob_snapshot/cluster_3:0.01506024096385542 - cluster/prob_snapshot/cluster_4:0.01506024096385542 - cluster/prob_snapshot/cluster_5:0.01506024096385542 - cluster/prob_snapshot/cluster_6:0.01506024096385542 - cluster/prob_snapshot/cluster_7:0.01731927710843373 - cluster/prob_snapshot/cluster_8:0.01506024096385542 - cluster/prob_snapshot/cluster_9:0.01506024096385542 - cluster/prob_snapshot/cluster_10:0.01506024096385542 - cluster/prob_snapshot/cluster_11:0.01506024096385542 - cluster/prob_snapshot/cluster_12:0.02183734939759036 - cluster/prob_snapshot/cluster_13:0.01731927710843373 - cluster/prob_snapshot/cluster_14:0.01731927710843373 - cluster/prob_snapshot/cluster_15:0.01731927710843373 - cluster/prob_snapshot/cluster_16:0.01506024096385542 - cluster/prob_snapshot/cluster_17:0.01731927710843373 - 
cluster/prob_snapshot/cluster_18:0.01506024096385542 - cluster/prob_snapshot/cluster_19:0.01506024096385542 - cluster/prob_snapshot/cluster_20:0.01506024096385542 - cluster/prob_snapshot/cluster_21:0.01506024096385542 - cluster/prob_snapshot/cluster_22:0.01506024096385542 - cluster/prob_snapshot/cluster_23:0.01731927710843373 - cluster/prob_snapshot/cluster_24:0.01506024096385542 - cluster/prob_snapshot/cluster_25:0.01506024096385542 - cluster/prob_snapshot/cluster_26:0.01506024096385542 - cluster/prob_snapshot/cluster_27:0.01731927710843373 - cluster/prob_snapshot/cluster_28:0.01506024096385542 - cluster/prob_snapshot/cluster_29:0.01506024096385542 - cluster/prob_snapshot/cluster_30:0.01506024096385542 - cluster/prob_snapshot/cluster_31:0.01731927710843373 - cluster/prob_snapshot/cluster_32:0.01506024096385542 - cluster/prob_snapshot/cluster_33:0.01506024096385542 - cluster/prob_snapshot/cluster_34:0.01506024096385542 - cluster/prob_snapshot/cluster_35:0.01506024096385542 - cluster/prob_snapshot/cluster_36:0.01731927710843373 - cluster/prob_snapshot/cluster_37:0.01506024096385542 - cluster/prob_snapshot/cluster_38:0.01506024096385542 - cluster/prob_snapshot/cluster_39:0.01731927710843373 - cluster/prob_snapshot/cluster_40:0.01506024096385542 - cluster/prob_snapshot/cluster_41:0.01506024096385542 - cluster/prob_snapshot/cluster_42:0.01506024096385542 - cluster/prob_snapshot/cluster_43:0.01506024096385542 - cluster/prob_snapshot/cluster_44:0.01506024096385542 - cluster/prob_snapshot/cluster_45:0.01506024096385542 - cluster/prob_snapshot/cluster_46:0.01731927710843373 - cluster/prob_snapshot/cluster_47:0.01506024096385542 - cluster/prob_snapshot/cluster_48:0.01731927710843373 - cluster/prob_snapshot/cluster_49:0.01506024096385542 - cluster/prob_snapshot/cluster_50:0.01506024096385542 - cluster/prob_snapshot/cluster_51:0.01506024096385542 - cluster/prob_snapshot/cluster_52:0.01506024096385542 - cluster/prob_snapshot/cluster_53:0.01506024096385542 - 
cluster/prob_snapshot/cluster_54:0.01506024096385542 - cluster/prob_snapshot/cluster_55:0.01506024096385542 - cluster/prob_snapshot/cluster_56:0.01506024096385542 - cluster/prob_snapshot/cluster_57:0.01506024096385542 - cluster/prob_snapshot/cluster_58:0.01731927710843373 - cluster/prob_snapshot/cluster_59:0.01506024096385542 - cluster/prob_snapshot/cluster_60:0.012801204819277106 - cluster/prob_snapshot/cluster_61:0.01506024096385542 - cluster/prob_snapshot/cluster_62:0.01731927710843373 - cluster/prob_snapshot/cluster_63:0.01506024096385542
[36m(TaskRunner pid=2823680)[0m local_global_step_folder: /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/checkpoints/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633/global_step_225
[36m(WorkerDict pid=2825160)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.
[36m(WorkerDict pid=2825160)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")
[36m(TaskRunner pid=2823680)[0m test_gen_batch meta info: {'eos_token_id': 151643, 'pad_token_id': 151643, 'recompute_log_prob': False, 'do_sample': True, 'validate': True, 'global_steps': 225}
[36m(TaskRunner pid=2823680)[0m validation generation end
[36m(TaskRunner pid=2823680)[0m Training Progress:  28%|██▊       | 225/800 [7:16:52<28:30:26, 178.48s/it]
[36m(WorkerDict pid=2825159)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.[32m [repeated 3x across cluster][0m
[36m(WorkerDict pid=2825159)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")[32m [repeated 3x across cluster][0m
[36m(TaskRunner pid=2823680)[0m step:225 - global_seqlen/min:349966 - global_seqlen/max:469415 - global_seqlen/minmax_diff:119449 - global_seqlen/balanced_min:419981 - global_seqlen/balanced_max:420104 - global_seqlen/mean:420017.0 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.20305935285833418) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010326082818210125 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0010327406057513144) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005392837511139152) - actor/ppo_kl:np.float64(-0.00021185450476325838) - actor/pg_clipfrac_lower:np.float64(3.2212853307564916e-05) - actor/grad_norm:np.float64(0.21661312648883232) - perf/mfu/actor:np.float64(0.1952109559587914) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(187.56788444519043) - actor/lr:np.float64(1e-06) - val-aux/aime2024/reward/mean@16:np.float64(0.11041666666666666) - val-aux/aime2024/reward/std@16:np.float64(0.11562876074642786) - val-aux/aime2024/reward/best@2/mean:np.float64(0.16163333333333335) - val-aux/aime2024/reward/best@2/std:np.float64(0.10592734486002421) - val-aux/aime2024/reward/worst@2/mean:np.float64(0.0583) - val-aux/aime2024/reward/worst@2/std:np.float64(0.09266812971467957) - val-aux/aime2024/reward/maj@2/mean:np.float64(0.1114) - val-aux/aime2024/reward/maj@2/std:np.float64(0.11522314334920654) - val-aux/aime2024/reward/best@4/mean:np.float64(0.20536666666666664) - val-aux/aime2024/reward/best@4/std:np.float64(0.07568351260591508) - val-aux/aime2024/reward/worst@4/mean:np.float64(0.01966666666666667) - val-aux/aime2024/reward/worst@4/std:np.float64(0.053599696448936425) - val-aux/aime2024/reward/maj@4/mean:np.float64(0.14293333333333333) - val-aux/aime2024/reward/maj@4/std:np.float64(0.10266413606732906) - val-aux/aime2024/reward/best@8/mean:np.float64(0.23340000000000002) - 
val-aux/aime2024/reward/best@8/std:np.float64(0.04360056629548094) - val-aux/aime2024/reward/worst@8/mean:np.float64(0.0027) - val-aux/aime2024/reward/worst@8/std:np.float64(0.019325615164885537) - val-aux/aime2024/reward/maj@8/mean:np.float64(0.1702) - val-aux/aime2024/reward/maj@8/std:np.float64(0.0809470412403074) - val-aux/aime2024/reward/best@16/mean:np.float64(0.25053333333333333) - val-aux/aime2024/reward/best@16/std:np.float64(0.028561257929447914) - val-aux/aime2024/reward/worst@16/mean:np.float64(6.666666666666667e-05) - val-aux/aime2024/reward/worst@16/std:np.float64(0.002107130750570548) - val-aux/aime2024/reward/maj@16/mean:np.float64(0.1895333333333333) - val-aux/aime2024/reward/maj@16/std:np.float64(0.04873079435026576) - val-aux/aime2024/score/mean@16:np.float64(0.11041666666666666) - val-aux/aime2024/score/std@16:np.float64(0.11562876074642786) - val-aux/aime2024/score/best@2/mean:np.float64(0.16163333333333335) - val-aux/aime2024/score/best@2/std:np.float64(0.10592734486002421) - val-aux/aime2024/score/worst@2/mean:np.float64(0.0583) - val-aux/aime2024/score/worst@2/std:np.float64(0.09266812971467957) - val-aux/aime2024/score/maj@2/mean:np.float64(0.1114) - val-aux/aime2024/score/maj@2/std:np.float64(0.11522314334920654) - val-aux/aime2024/score/best@4/mean:np.float64(0.20536666666666664) - val-aux/aime2024/score/best@4/std:np.float64(0.07568351260591508) - val-aux/aime2024/score/worst@4/mean:np.float64(0.01966666666666667) - val-aux/aime2024/score/worst@4/std:np.float64(0.053599696448936425) - val-aux/aime2024/score/maj@4/mean:np.float64(0.14293333333333333) - val-aux/aime2024/score/maj@4/std:np.float64(0.10266413606732906) - val-aux/aime2024/score/best@8/mean:np.float64(0.23340000000000002) - val-aux/aime2024/score/best@8/std:np.float64(0.04360056629548094) - val-aux/aime2024/score/worst@8/mean:np.float64(0.0027) - val-aux/aime2024/score/worst@8/std:np.float64(0.019325615164885537) - val-aux/aime2024/score/maj@8/mean:np.float64(0.1702) - 
val-aux/aime2024/score/maj@8/std:np.float64(0.0809470412403074) - val-aux/aime2024/score/best@16/mean:np.float64(0.25053333333333333) - val-aux/aime2024/score/best@16/std:np.float64(0.028561257929447914) - val-aux/aime2024/score/worst@16/mean:np.float64(6.666666666666667e-05) - val-aux/aime2024/score/worst@16/std:np.float64(0.002107130750570548) - val-aux/aime2024/score/maj@16/mean:np.float64(0.1895333333333333) - val-aux/aime2024/score/maj@16/std:np.float64(0.04873079435026576) - val-core/aime2024/acc/mean@16:np.float64(0.11041666666666666) - val-aux/aime2024/acc/std@16:np.float64(0.11562876074642786) - val-aux/aime2024/acc/best@2/mean:np.float64(0.16163333333333335) - val-aux/aime2024/acc/best@2/std:np.float64(0.10592734486002421) - val-aux/aime2024/acc/worst@2/mean:np.float64(0.0583) - val-aux/aime2024/acc/worst@2/std:np.float64(0.09266812971467957) - val-aux/aime2024/acc/maj@2/mean:np.float64(0.1114) - val-aux/aime2024/acc/maj@2/std:np.float64(0.11522314334920654) - val-aux/aime2024/acc/best@4/mean:np.float64(0.20536666666666664) - val-aux/aime2024/acc/best@4/std:np.float64(0.07568351260591508) - val-aux/aime2024/acc/worst@4/mean:np.float64(0.01966666666666667) - val-aux/aime2024/acc/worst@4/std:np.float64(0.053599696448936425) - val-aux/aime2024/acc/maj@4/mean:np.float64(0.14293333333333333) - val-aux/aime2024/acc/maj@4/std:np.float64(0.10266413606732906) - val-aux/aime2024/acc/best@8/mean:np.float64(0.23340000000000002) - val-aux/aime2024/acc/best@8/std:np.float64(0.04360056629548094) - val-aux/aime2024/acc/worst@8/mean:np.float64(0.0027) - val-aux/aime2024/acc/worst@8/std:np.float64(0.019325615164885537) - val-aux/aime2024/acc/maj@8/mean:np.float64(0.1702) - val-aux/aime2024/acc/maj@8/std:np.float64(0.0809470412403074) - val-core/aime2024/acc/best@16/mean:np.float64(0.25053333333333333) - val-core/aime2024/acc/best@16/std:np.float64(0.028561257929447914) - val-aux/aime2024/acc/worst@16/mean:np.float64(6.666666666666667e-05) - 
val-aux/aime2024/acc/worst@16/std:np.float64(0.002107130750570548) - val-core/aime2024/acc/maj@16/mean:np.float64(0.1895333333333333) - val-core/aime2024/acc/maj@16/std:np.float64(0.04873079435026576) - val-aux/aime2025/reward/mean@16:np.float64(0.058333333333333334) - val-aux/aime2025/reward/std@16:np.float64(0.09053667902010248) - val-aux/aime2025/reward/best@2/mean:np.float64(0.09686666666666666) - val-aux/aime2025/reward/best@2/std:np.float64(0.10049048633843555) - val-aux/aime2025/reward/worst@2/mean:np.float64(0.0208) - val-aux/aime2025/reward/worst@2/std:np.float64(0.05463762457509744) - val-aux/aime2025/reward/maj@2/mean:np.float64(0.058166666666666665) - val-aux/aime2025/reward/maj@2/std:np.float64(0.09190600548408692) - val-aux/aime2025/reward/best@4/mean:np.float64(0.14026666666666668) - val-aux/aime2025/reward/best@4/std:np.float64(0.09157043831186738) - val-aux/aime2025/reward/worst@4/mean:np.float64(0.0032333333333333333) - val-aux/aime2025/reward/worst@4/std:np.float64(0.018604628687625514) - val-aux/aime2025/reward/maj@4/mean:np.float64(0.07293333333333334) - val-aux/aime2025/reward/maj@4/std:np.float64(0.09045046255396919) - val-aux/aime2025/reward/best@8/mean:np.float64(0.1776) - val-aux/aime2025/reward/best@8/std:np.float64(0.07089697850548525) - val-aux/aime2025/reward/worst@8/mean:np.float64(0.00013333333333333334) - val-aux/aime2025/reward/worst@8/std:np.float64(0.002876566563801983) - val-aux/aime2025/reward/maj@8/mean:np.float64(0.0873) - val-aux/aime2025/reward/maj@8/std:np.float64(0.0840398979190248) - val-aux/aime2025/reward/best@16/mean:np.float64(0.20486666666666667) - val-aux/aime2025/reward/best@16/std:np.float64(0.04741527061603892) - val-aux/aime2025/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2025/reward/worst@16/std:np.float64(0.0) - val-aux/aime2025/reward/maj@16/mean:np.float64(0.10093333333333335) - val-aux/aime2025/reward/maj@16/std:np.float64(0.07170879759573116) - 
val-aux/aime2025/score/mean@16:np.float64(0.058333333333333334) - val-aux/aime2025/score/std@16:np.float64(0.09053667902010248) - val-aux/aime2025/score/best@2/mean:np.float64(0.09686666666666666) - val-aux/aime2025/score/best@2/std:np.float64(0.10049048633843555) - val-aux/aime2025/score/worst@2/mean:np.float64(0.0208) - val-aux/aime2025/score/worst@2/std:np.float64(0.05463762457509744) - val-aux/aime2025/score/maj@2/mean:np.float64(0.058166666666666665) - val-aux/aime2025/score/maj@2/std:np.float64(0.09190600548408692) - val-aux/aime2025/score/best@4/mean:np.float64(0.14026666666666668) - val-aux/aime2025/score/best@4/std:np.float64(0.09157043831186738) - val-aux/aime2025/score/worst@4/mean:np.float64(0.0032333333333333333) - val-aux/aime2025/score/worst@4/std:np.float64(0.018604628687625514) - val-aux/aime2025/score/maj@4/mean:np.float64(0.07293333333333334) - val-aux/aime2025/score/maj@4/std:np.float64(0.09045046255396919) - val-aux/aime2025/score/best@8/mean:np.float64(0.1776) - val-aux/aime2025/score/best@8/std:np.float64(0.07089697850548525) - val-aux/aime2025/score/worst@8/mean:np.float64(0.00013333333333333334) - val-aux/aime2025/score/worst@8/std:np.float64(0.002876566563801983) - val-aux/aime2025/score/maj@8/mean:np.float64(0.0873) - val-aux/aime2025/score/maj@8/std:np.float64(0.0840398979190248) - val-aux/aime2025/score/best@16/mean:np.float64(0.20486666666666667) - val-aux/aime2025/score/best@16/std:np.float64(0.04741527061603892) - val-aux/aime2025/score/worst@16/mean:np.float64(0.0) - val-aux/aime2025/score/worst@16/std:np.float64(0.0) - val-aux/aime2025/score/maj@16/mean:np.float64(0.10093333333333335) - val-aux/aime2025/score/maj@16/std:np.float64(0.07170879759573116) - val-core/aime2025/acc/mean@16:np.float64(0.058333333333333334) - val-aux/aime2025/acc/std@16:np.float64(0.09053667902010248) - val-aux/aime2025/acc/best@2/mean:np.float64(0.09686666666666666) - val-aux/aime2025/acc/best@2/std:np.float64(0.10049048633843555) - 
val-aux/aime2025/acc/worst@2/mean:np.float64(0.0208) - val-aux/aime2025/acc/worst@2/std:np.float64(0.05463762457509744) - val-aux/aime2025/acc/maj@2/mean:np.float64(0.058166666666666665) - val-aux/aime2025/acc/maj@2/std:np.float64(0.09190600548408692) - val-aux/aime2025/acc/best@4/mean:np.float64(0.14026666666666668) - val-aux/aime2025/acc/best@4/std:np.float64(0.09157043831186738) - val-aux/aime2025/acc/worst@4/mean:np.float64(0.0032333333333333333) - val-aux/aime2025/acc/worst@4/std:np.float64(0.018604628687625514) - val-aux/aime2025/acc/maj@4/mean:np.float64(0.07293333333333334) - val-aux/aime2025/acc/maj@4/std:np.float64(0.09045046255396919) - val-aux/aime2025/acc/best@8/mean:np.float64(0.1776) - val-aux/aime2025/acc/best@8/std:np.float64(0.07089697850548525) - val-aux/aime2025/acc/worst@8/mean:np.float64(0.00013333333333333334) - val-aux/aime2025/acc/worst@8/std:np.float64(0.002876566563801983) - val-aux/aime2025/acc/maj@8/mean:np.float64(0.0873) - val-aux/aime2025/acc/maj@8/std:np.float64(0.0840398979190248) - val-core/aime2025/acc/best@16/mean:np.float64(0.20486666666666667) - val-core/aime2025/acc/best@16/std:np.float64(0.04741527061603892) - val-aux/aime2025/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2025/acc/worst@16/std:np.float64(0.0) - val-core/aime2025/acc/maj@16/mean:np.float64(0.10093333333333335) - val-core/aime2025/acc/maj@16/std:np.float64(0.07170879759573116) - val-aux/math500/reward/mean@4:np.float64(0.6975) - val-aux/math500/reward/std@4:np.float64(0.13266471820493492) - val-aux/math500/reward/best@2/mean:np.float64(0.757028) - val-aux/math500/reward/best@2/std:np.float64(0.10889573334005868) - val-aux/math500/reward/worst@2/mean:np.float64(0.638394) - val-aux/math500/reward/worst@2/std:np.float64(0.119458916520759) - val-aux/math500/reward/maj@2/mean:np.float64(0.6979339999999999) - val-aux/math500/reward/maj@2/std:np.float64(0.13276655808316215) - val-aux/math500/reward/best@4/mean:np.float64(0.800472) - 
val-aux/math500/reward/best@4/std:np.float64(0.06723020808450676) - val-aux/math500/reward/worst@4/mean:np.float64(0.5873079999999999) - val-aux/math500/reward/worst@4/std:np.float64(0.0848918739509777) - val-aux/math500/reward/maj@4/mean:np.float64(0.712076) - val-aux/math500/reward/maj@4/std:np.float64(0.12118813272345434) - val-aux/math500/score/mean@4:np.float64(0.6975) - val-aux/math500/score/std@4:np.float64(0.13266471820493492) - val-aux/math500/score/best@2/mean:np.float64(0.757028) - val-aux/math500/score/best@2/std:np.float64(0.10889573334005868) - val-aux/math500/score/worst@2/mean:np.float64(0.638394) - val-aux/math500/score/worst@2/std:np.float64(0.119458916520759) - val-aux/math500/score/maj@2/mean:np.float64(0.6979339999999999) - val-aux/math500/score/maj@2/std:np.float64(0.13276655808316215) - val-aux/math500/score/best@4/mean:np.float64(0.800472) - val-aux/math500/score/best@4/std:np.float64(0.06723020808450676) - val-aux/math500/score/worst@4/mean:np.float64(0.5873079999999999) - val-aux/math500/score/worst@4/std:np.float64(0.0848918739509777) - val-aux/math500/score/maj@4/mean:np.float64(0.712076) - val-aux/math500/score/maj@4/std:np.float64(0.12118813272345434) - val-core/math500/acc/mean@4:np.float64(0.6975) - val-aux/math500/acc/std@4:np.float64(0.13266471820493492) - val-aux/math500/acc/best@2/mean:np.float64(0.757028) - val-aux/math500/acc/best@2/std:np.float64(0.10889573334005868) - val-aux/math500/acc/worst@2/mean:np.float64(0.638394) - val-aux/math500/acc/worst@2/std:np.float64(0.119458916520759) - val-aux/math500/acc/maj@2/mean:np.float64(0.6979339999999999) - val-aux/math500/acc/maj@2/std:np.float64(0.13276655808316215) - val-core/math500/acc/best@4/mean:np.float64(0.800472) - val-core/math500/acc/best@4/std:np.float64(0.06723020808450676) - val-aux/math500/acc/worst@4/mean:np.float64(0.5873079999999999) - val-aux/math500/acc/worst@4/std:np.float64(0.0848918739509777) - val-core/math500/acc/maj@4/mean:np.float64(0.712076) - 
val-core/math500/acc/maj@4/std:np.float64(0.12118813272345434) - val-aux/num_turns/min:np.int32(2) - val-aux/num_turns/max:np.int32(2) - val-aux/num_turns/mean:np.float64(2.0) - val-aux/response_length/clip_ratio:0.06114864864864865 - val-aux/aime2024/response_length/clip_ratio:0.12708333333333333 - val-aux/aime2025/response_length/clip_ratio:0.11458333333333333 - val-aux/math500/response_length/clip_ratio:0.0325 - training/global_step:225 - training/epoch:0 - critic/score/mean:0.5502577424049377 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5426386594772339 - critic/rewards/max:1.0458835363388062 - critic/rewards/min:-0.07543711364269257 - critic/advantages/mean:-0.1587866246700287 - critic/advantages/max:2.474778652191162 - critic/advantages/min:-2.47485089302063 - critic/returns/mean:-0.1587866246700287 - critic/returns/max:2.474778652191162 - critic/returns/min:-2.47485089302063 - response_length/mean:1377.737060546875 - response_length/max:8192.0 - response_length/min:222.0 - response_length/clip_ratio:0.019329896196722984 - response_length_non_aborted/mean:1377.737060546875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:222.0 - response_length_non_aborted/clip_ratio:0.019329896196722984 - response/aborted_ratio:0.0 - prompt_length/mean:243.09278869628906 - prompt_length/max:667.0 - prompt_length/min:178.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.091757237911224e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.834619882516563) - timing_s/agent_loop/generate_sequences/max:np.float64(30.10826869495213) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.63999137371502) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - 
timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.10826869495213) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:195 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.398679613135755 - timing_s/reward:0.00017264112830162048 - timing_s/old_log_prob:11.508281826041639 - timing_s/ref:24.089892745949328 - timing_s/adv:0.12259378097951412 - timing_s/update_actor:25.819980800151825 - timing_s/save_checkpoint:52.377286962233484 - timing_s/update_weights:31.341772156767547 - timing_s/step:178.0636379038915 - timing_s/testing:130.3027075920254 - timing_s/stop_profile:0.0008622715249657631 - timing_per_token_ms/adv:9.746962147073228e-05 - timing_per_token_ms/update_actor:0.020528478156595215 - timing_per_token_ms/gen:0.03030394941385261 - timing_per_token_ms/ref:0.01915295138511623 - perf/total_num_tokens:1680068 - perf/time_per_step:178.0636379038915 - perf/throughput:2358.8027569486194 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:62.0 - frontier/mean_score:2.10265625 - frontier/mean_frontier_pct:0.02001725479810128 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - 
frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.3 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.3 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:2.9299999999999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.3 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.3 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.3 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.9 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.3 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - 
frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.51 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.3 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:1.7 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.3 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:1.7 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.51 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - 
frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:1.7 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.51 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:0.0 - frontier/cluster_50/score:2.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - 
frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.7 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:225.0 - cluster/prob_snapshot/cluster_0:0.014862153526045925 - cluster/prob_snapshot/cluster_1:0.014862153526045925 - cluster/prob_snapshot/cluster_2:0.014862153526045925 - cluster/prob_snapshot/cluster_3:0.014862153526045925 - cluster/prob_snapshot/cluster_4:0.014862153526045925 - cluster/prob_snapshot/cluster_5:0.014862153526045925 - cluster/prob_snapshot/cluster_6:0.014862153526045925 - cluster/prob_snapshot/cluster_7:0.01709147655495281 - cluster/prob_snapshot/cluster_8:0.01709147655495281 - cluster/prob_snapshot/cluster_9:0.014862153526045925 - cluster/prob_snapshot/cluster_10:0.014862153526045925 - cluster/prob_snapshot/cluster_11:0.01709147655495281 - cluster/prob_snapshot/cluster_12:0.02177305491565728 - cluster/prob_snapshot/cluster_13:0.01709147655495281 - cluster/prob_snapshot/cluster_14:0.018652002675187632 - cluster/prob_snapshot/cluster_15:0.01709147655495281 - cluster/prob_snapshot/cluster_16:0.01709147655495281 - cluster/prob_snapshot/cluster_17:0.01709147655495281 - cluster/prob_snapshot/cluster_18:0.014862153526045925 - cluster/prob_snapshot/cluster_19:0.014862153526045925 - cluster/prob_snapshot/cluster_20:0.02155012261276659 - cluster/prob_snapshot/cluster_21:0.01709147655495281 - cluster/prob_snapshot/cluster_22:0.014862153526045925 - cluster/prob_snapshot/cluster_23:0.018652002675187632 - cluster/prob_snapshot/cluster_24:0.014862153526045925 - cluster/prob_snapshot/cluster_25:0.014862153526045925 - 
cluster/prob_snapshot/cluster_26:0.014862153526045925 - cluster/prob_snapshot/cluster_27:0.01709147655495281 - cluster/prob_snapshot/cluster_28:0.012632830497139036 - cluster/prob_snapshot/cluster_29:0.014862153526045925 - cluster/prob_snapshot/cluster_30:0.014862153526045925 - cluster/prob_snapshot/cluster_31:0.01709147655495281 - cluster/prob_snapshot/cluster_32:0.012632830497139036 - cluster/prob_snapshot/cluster_33:0.012632830497139036 - cluster/prob_snapshot/cluster_34:0.014862153526045925 - cluster/prob_snapshot/cluster_35:0.014862153526045925 - cluster/prob_snapshot/cluster_36:0.01709147655495281 - cluster/prob_snapshot/cluster_37:0.014862153526045925 - cluster/prob_snapshot/cluster_38:0.014862153526045925 - cluster/prob_snapshot/cluster_39:0.018652002675187632 - cluster/prob_snapshot/cluster_40:0.014862153526045925 - cluster/prob_snapshot/cluster_41:0.014862153526045925 - cluster/prob_snapshot/cluster_42:0.012632830497139036 - cluster/prob_snapshot/cluster_43:0.014862153526045925 - cluster/prob_snapshot/cluster_44:0.014862153526045925 - cluster/prob_snapshot/cluster_45:0.014862153526045925 - cluster/prob_snapshot/cluster_46:0.018652002675187632 - cluster/prob_snapshot/cluster_47:0.014862153526045925 - cluster/prob_snapshot/cluster_48:0.01709147655495281 - cluster/prob_snapshot/cluster_49:0.014862153526045925 - cluster/prob_snapshot/cluster_50:0.014862153526045925 - cluster/prob_snapshot/cluster_51:0.014862153526045925 - cluster/prob_snapshot/cluster_52:0.014862153526045925 - cluster/prob_snapshot/cluster_53:0.014862153526045925 - cluster/prob_snapshot/cluster_54:0.014862153526045925 - cluster/prob_snapshot/cluster_55:0.014862153526045925 - cluster/prob_snapshot/cluster_56:0.01709147655495281 - cluster/prob_snapshot/cluster_57:0.014862153526045925 - cluster/prob_snapshot/cluster_58:0.01709147655495281 - cluster/prob_snapshot/cluster_59:0.014862153526045925 - cluster/prob_snapshot/cluster_60:0.012632830497139036 - 
cluster/prob_snapshot/cluster_61:0.014862153526045925 - cluster/prob_snapshot/cluster_62:0.01709147655495281 - cluster/prob_snapshot/cluster_63:0.012632830497139036
[36m(TaskRunner pid=2823680)[0m Training Progress:  28%|██▊       | 226/800 [7:18:40<25:04:22, 157.25s/it]
[36m(TaskRunner pid=2823680)[0m step:226 - global_seqlen/min:366332 - global_seqlen/max:533907 - global_seqlen/minmax_diff:167575 - global_seqlen/balanced_min:444200 - global_seqlen/balanced_max:444371 - global_seqlen/mean:444316.5 - frontier/skipped_zero_acc_count:51.0 - actor/entropy:np.float64(0.21973243943200663) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011686294339597225 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04298688776907511) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00038783099500086304) - actor/ppo_kl:np.float64(3.86190265777632e-05) - actor/pg_clipfrac_lower:np.float64(1.019170418331692e-05) - actor/grad_norm:np.float64(0.2604148849844933) - perf/mfu/actor:np.float64(0.2987096015347741) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(145.99613571166992) - actor/lr:np.float64(1e-06) - training/global_step:226 - training/epoch:0 - critic/score/mean:0.5974025726318359 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5879497528076172 - critic/rewards/max:1.0220144987106323 - critic/rewards/min:-0.04664847254753113 - critic/advantages/mean:-0.18093794584274292 - critic/advantages/max:2.4748122692108154 - critic/advantages/min:-2.4748575687408447 - critic/returns/mean:-0.18093794584274292 - critic/returns/max:2.4748122692108154 - critic/returns/min:-2.4748575687408447 - response_length/mean:1227.082763671875 - response_length/max:8192.0 - response_length/min:281.0 - response_length/clip_ratio:0.01461038924753666 - response_length_non_aborted/mean:1227.082763671875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:281.0 - response_length_non_aborted/clip_ratio:0.01461038924753666 - response/aborted_ratio:0.0 - prompt_length/mean:241.81817626953125 - prompt_length/max:386.0 - prompt_length/min:182.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010112859308719635 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7080437485128641) - timing_s/agent_loop/generate_sequences/max:np.float64(32.09278308879584) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.895506129229034) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(32.09278308879584) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:204 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:33.93545556347817 - timing_s/reward:0.0008552083745598793 - timing_s/old_log_prob:9.292812143452466 - timing_s/ref:19.469317900016904 - timing_s/adv:0.08231073245406151 - timing_s/update_actor:17.75016583222896 - timing_s/update_weights:26.50764797627926 - timing_s/step:107.51123084034771 - timing_s/stop_profile:4.852563142776489e-05 - timing_per_token_ms/adv:9.096686657692164e-05 - timing_per_token_ms/update_actor:0.01961684605199903 - timing_per_token_ms/gen:0.04489511678854819 - timing_per_token_ms/ref:0.02151679120026005 - perf/total_num_tokens:1777266 - perf/time_per_step:107.51123084034771 - perf/throughput:4132.744984194276 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:113.0 - frontier/mean_score:2.1175625 - frontier/mean_frontier_pct:0.03999751797235769 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.51 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.3 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:2.9299999999999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.3 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.3 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 
- frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.9299999999999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.51 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.6569999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.3 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.3 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:1.7 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:3.11 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:1.7 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - 
frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.51 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.7 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.6569999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - 
frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.7 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:226.0 - cluster/prob_snapshot/cluster_0:0.014757533720964552 - cluster/prob_snapshot/cluster_1:0.014757533720964552 - cluster/prob_snapshot/cluster_2:0.014757533720964552 - cluster/prob_snapshot/cluster_3:0.014757533720964552 - cluster/prob_snapshot/cluster_4:0.012543903662819869 - cluster/prob_snapshot/cluster_5:0.016971163779109233 - cluster/prob_snapshot/cluster_6:0.014757533720964552 - cluster/prob_snapshot/cluster_7:0.01852070481981051 - cluster/prob_snapshot/cluster_8:0.016971163779109233 - cluster/prob_snapshot/cluster_9:0.014757533720964552 - cluster/prob_snapshot/cluster_10:0.014757533720964552 - cluster/prob_snapshot/cluster_11:0.016971163779109233 - cluster/prob_snapshot/cluster_12:0.021619786901213068 - cluster/prob_snapshot/cluster_13:0.016971163779109233 - cluster/prob_snapshot/cluster_14:0.01852070481981051 - cluster/prob_snapshot/cluster_15:0.016971163779109233 - cluster/prob_snapshot/cluster_16:0.01852070481981051 - 
cluster/prob_snapshot/cluster_17:0.016971163779109233 - cluster/prob_snapshot/cluster_18:0.014757533720964552 - cluster/prob_snapshot/cluster_19:0.014757533720964552 - cluster/prob_snapshot/cluster_20:0.021619786901213068 - cluster/prob_snapshot/cluster_21:0.01852070481981051 - cluster/prob_snapshot/cluster_22:0.014757533720964552 - cluster/prob_snapshot/cluster_23:0.019605383548301405 - cluster/prob_snapshot/cluster_24:0.016971163779109233 - cluster/prob_snapshot/cluster_25:0.012543903662819869 - cluster/prob_snapshot/cluster_26:0.014757533720964552 - cluster/prob_snapshot/cluster_27:0.016971163779109233 - cluster/prob_snapshot/cluster_28:0.012543903662819869 - cluster/prob_snapshot/cluster_29:0.012543903662819869 - cluster/prob_snapshot/cluster_30:0.014757533720964552 - cluster/prob_snapshot/cluster_31:0.02294796493609988 - cluster/prob_snapshot/cluster_32:0.012543903662819869 - cluster/prob_snapshot/cluster_33:0.012543903662819869 - cluster/prob_snapshot/cluster_34:0.014757533720964552 - cluster/prob_snapshot/cluster_35:0.014757533720964552 - cluster/prob_snapshot/cluster_36:0.016971163779109233 - cluster/prob_snapshot/cluster_37:0.014757533720964552 - cluster/prob_snapshot/cluster_38:0.014757533720964552 - cluster/prob_snapshot/cluster_39:0.01852070481981051 - cluster/prob_snapshot/cluster_40:0.014757533720964552 - cluster/prob_snapshot/cluster_41:0.016971163779109233 - cluster/prob_snapshot/cluster_42:0.010994362622118592 - cluster/prob_snapshot/cluster_43:0.014757533720964552 - cluster/prob_snapshot/cluster_44:0.012543903662819869 - cluster/prob_snapshot/cluster_45:0.014757533720964552 - cluster/prob_snapshot/cluster_46:0.019605383548301405 - cluster/prob_snapshot/cluster_47:0.014757533720964552 - cluster/prob_snapshot/cluster_48:0.016971163779109233 - cluster/prob_snapshot/cluster_49:0.014757533720964552 - cluster/prob_snapshot/cluster_50:0.012543903662819869 - cluster/prob_snapshot/cluster_51:0.014757533720964552 - 
cluster/prob_snapshot/cluster_52:0.014757533720964552 - cluster/prob_snapshot/cluster_53:0.014757533720964552 - cluster/prob_snapshot/cluster_54:0.014757533720964552 - cluster/prob_snapshot/cluster_55:0.014757533720964552 - cluster/prob_snapshot/cluster_56:0.016971163779109233 - cluster/prob_snapshot/cluster_57:0.014757533720964552 - cluster/prob_snapshot/cluster_58:0.016971163779109233 - cluster/prob_snapshot/cluster_59:0.014757533720964552 - cluster/prob_snapshot/cluster_60:0.012543903662819869 - cluster/prob_snapshot/cluster_61:0.014757533720964552 - cluster/prob_snapshot/cluster_62:0.016971163779109233 - cluster/prob_snapshot/cluster_63:0.012543903662819869
[36m(RewardLoopWorker pid=2826755)[0m WARNING:2026-04-12 18:51:04,051:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  28%|██▊       | 227/800 [7:20:44<23:27:06, 147.34s/it]
[36m(TaskRunner pid=2823680)[0m step:227 - global_seqlen/min:397996 - global_seqlen/max:530935 - global_seqlen/minmax_diff:132939 - global_seqlen/balanced_min:455982 - global_seqlen/balanced_max:456120 - global_seqlen/mean:456047.5 - frontier/skipped_zero_acc_count:36.0 - actor/entropy:np.float64(0.21883376995506493) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011067104525864124 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.07691083809186239) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003994572211329638) - actor/ppo_kl:np.float64(5.7560466986649814e-05) - actor/pg_clipfrac_lower:np.float64(2.660965413309421e-06) - actor/grad_norm:np.float64(0.26690304403503734) - perf/mfu/actor:np.float64(0.24625918215645948) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(146.61573791503906) - actor/lr:np.float64(1e-06) - training/global_step:227 - training/epoch:0 - critic/score/mean:0.5625 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5527612566947937 - critic/rewards/max:1.024669885635376 - critic/rewards/min:-0.08868677914142609 - critic/advantages/mean:-0.13023346662521362 - critic/advantages/max:2.4748406410217285 - critic/advantages/min:-2.4748175144195557 - critic/returns/mean:-0.13023346662521362 - critic/returns/max:2.4748406410217285 - critic/returns/min:-2.4748175144195557 - response_length/mean:1400.358642578125 - response_length/max:8192.0 - response_length/min:247.0 - response_length/clip_ratio:0.019021738320589066 - response_length_non_aborted/mean:1400.358642578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:247.0 - response_length_non_aborted/clip_ratio:0.019021738320589066 - response/aborted_ratio:0.0 - prompt_length/mean:243.43478393554688 - prompt_length/max:380.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.880160748958588e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.9906809162348509) - timing_s/agent_loop/generate_sequences/max:np.float64(31.776559315621853) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.345751491320698) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(31.776559315621853) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:199 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:33.436769399791956 - timing_s/reward:0.00018788501620292664 - timing_s/old_log_prob:10.377917908132076 - timing_s/ref:24.702864300459623 - timing_s/adv:0.0957174189388752 - timing_s/update_actor:21.936493668705225 - timing_s/update_weights:32.92635930608958 - timing_s/step:123.91397934220731 - timing_s/stop_profile:6.278418004512787e-05 - timing_per_token_ms/adv:7.911628964920353e-05 - timing_per_token_ms/update_actor:0.018131851090651616 - timing_per_token_ms/gen:0.0324419688664705 - timing_per_token_ms/ref:0.020418425285874092 - perf/total_num_tokens:1824190 - perf/time_per_step:123.91397934220731 - perf/throughput:3680.3555371307657 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:149.0 - frontier/mean_score:2.1121999999999996 - frontier/mean_frontier_pct:0.05421220730804483 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:8.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.51 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.51 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:2.9299999999999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.3 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.6569999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.6569999999999996 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.3 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 
- frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.9299999999999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.6569999999999996 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.1598999999999995 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.3 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.51 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:1.7 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:1.7 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:3.11 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - 
frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:1.91 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.51 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:3.3598999999999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:2.51 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - 
frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.3 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.7 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:227.0 - cluster/prob_snapshot/cluster_0:0.014795000473440017 - cluster/prob_snapshot/cluster_1:0.014795000473440017 - cluster/prob_snapshot/cluster_2:0.012575750402424014 - cluster/prob_snapshot/cluster_3:0.014795000473440017 - cluster/prob_snapshot/cluster_4:0.012575750402424014 - cluster/prob_snapshot/cluster_5:0.01701425054445602 - cluster/prob_snapshot/cluster_6:0.014795000473440017 - cluster/prob_snapshot/cluster_7:0.01856772559416722 - cluster/prob_snapshot/cluster_8:0.01856772559416722 - cluster/prob_snapshot/cluster_9:0.014795000473440017 - cluster/prob_snapshot/cluster_10:0.014795000473440017 - cluster/prob_snapshot/cluster_11:0.01701425054445602 - cluster/prob_snapshot/cluster_12:0.021674675693589624 - cluster/prob_snapshot/cluster_13:0.01701425054445602 - cluster/prob_snapshot/cluster_14:0.01965515812896506 - cluster/prob_snapshot/cluster_15:0.01701425054445602 - 
cluster/prob_snapshot/cluster_16:0.01965515812896506 - cluster/prob_snapshot/cluster_17:0.01701425054445602 - cluster/prob_snapshot/cluster_18:0.014795000473440017 - cluster/prob_snapshot/cluster_19:0.014795000473440017 - cluster/prob_snapshot/cluster_20:0.021674675693589624 - cluster/prob_snapshot/cluster_21:0.01965515812896506 - cluster/prob_snapshot/cluster_22:0.014795000473440017 - cluster/prob_snapshot/cluster_23:0.015977860761291544 - cluster/prob_snapshot/cluster_24:0.01701425054445602 - cluster/prob_snapshot/cluster_25:0.011022275352712814 - cluster/prob_snapshot/cluster_26:0.014795000473440017 - cluster/prob_snapshot/cluster_27:0.01856772559416722 - cluster/prob_snapshot/cluster_28:0.012575750402424014 - cluster/prob_snapshot/cluster_29:0.012575750402424014 - cluster/prob_snapshot/cluster_30:0.012575750402424014 - cluster/prob_snapshot/cluster_31:0.023006225736199228 - cluster/prob_snapshot/cluster_32:0.011022275352712814 - cluster/prob_snapshot/cluster_33:0.012575750402424014 - cluster/prob_snapshot/cluster_34:0.014795000473440017 - cluster/prob_snapshot/cluster_35:0.014795000473440017 - cluster/prob_snapshot/cluster_36:0.014129225452135217 - cluster/prob_snapshot/cluster_37:0.014795000473440017 - cluster/prob_snapshot/cluster_38:0.014795000473440017 - cluster/prob_snapshot/cluster_39:0.01856772559416722 - cluster/prob_snapshot/cluster_40:0.014795000473440017 - cluster/prob_snapshot/cluster_41:0.01701425054445602 - cluster/prob_snapshot/cluster_42:0.011022275352712814 - cluster/prob_snapshot/cluster_43:0.014795000473440017 - cluster/prob_snapshot/cluster_44:0.011022275352712814 - cluster/prob_snapshot/cluster_45:0.012575750402424014 - cluster/prob_snapshot/cluster_46:0.024854861045355555 - cluster/prob_snapshot/cluster_47:0.014795000473440017 - cluster/prob_snapshot/cluster_48:0.01856772559416722 - cluster/prob_snapshot/cluster_49:0.014795000473440017 - cluster/prob_snapshot/cluster_50:0.012575750402424014 - 
cluster/prob_snapshot/cluster_51:0.014795000473440017 - cluster/prob_snapshot/cluster_52:0.014795000473440017 - cluster/prob_snapshot/cluster_53:0.014795000473440017 - cluster/prob_snapshot/cluster_54:0.014795000473440017 - cluster/prob_snapshot/cluster_55:0.01701425054445602 - cluster/prob_snapshot/cluster_56:0.01701425054445602 - cluster/prob_snapshot/cluster_57:0.014795000473440017 - cluster/prob_snapshot/cluster_58:0.01701425054445602 - cluster/prob_snapshot/cluster_59:0.014795000473440017 - cluster/prob_snapshot/cluster_60:0.012575750402424014 - cluster/prob_snapshot/cluster_61:0.014795000473440017 - cluster/prob_snapshot/cluster_62:0.01701425054445602 - cluster/prob_snapshot/cluster_63:0.012575750402424014
[36m(RewardLoopWorker pid=2826760)[0m WARNING:2026-04-12 18:53:12,730:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  28%|██▊       | 228/800 [7:22:34<21:37:51, 136.14s/it]
[36m(TaskRunner pid=2823680)[0m step:228 - global_seqlen/min:345847 - global_seqlen/max:417062 - global_seqlen/minmax_diff:71215 - global_seqlen/balanced_min:390060 - global_seqlen/balanced_max:390308 - global_seqlen/mean:390150.25 - frontier/skipped_zero_acc_count:26.0 - actor/entropy:np.float64(0.19614066846449585) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011835341341793537 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.08011364736012183) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00037880314894388065) - actor/ppo_kl:np.float64(5.571235368701728e-05) - actor/pg_clipfrac_lower:np.float64(3.0305423291899977e-06) - actor/grad_norm:np.float64(0.24983904797297257) - perf/mfu/actor:np.float64(0.19209342918332206) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(168.2702693939209) - actor/lr:np.float64(1e-06) - training/global_step:228 - training/epoch:0 - critic/score/mean:0.6213235259056091 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.612163245677948 - critic/rewards/max:1.0339140892028809 - critic/rewards/min:-0.07094258069992065 - critic/advantages/mean:-0.13012126088142395 - critic/advantages/max:2.4747860431671143 - critic/advantages/min:-2.4748475551605225 - critic/returns/mean:-0.13012126088142395 - critic/returns/max:2.4747860431671143 - critic/returns/min:-2.4748475551605225 - response_length/mean:1167.4547119140625 - response_length/max:8192.0 - response_length/min:236.0 - response_length/clip_ratio:0.013480392284691334 - response_length_non_aborted/mean:1167.4547119140625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:236.0 - response_length_non_aborted/clip_ratio:0.013480392284691334 - response/aborted_ratio:0.0 - prompt_length/mean:240.5098114013672 - prompt_length/max:416.0 - prompt_length/min:179.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.493343532085419e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.8928682450205088) - timing_s/agent_loop/generate_sequences/max:np.float64(29.097716940566897) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.922178565018839) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.097716940566897) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:187 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.882317163050175 - timing_s/reward:0.0001506255939602852 - timing_s/old_log_prob:11.30981619656086 - timing_s/ref:15.737203674390912 - timing_s/adv:0.10175994504243135 - timing_s/update_actor:24.501000260934234 - timing_s/update_weights:25.28059098497033 - timing_s/step:108.2576196771115 - timing_s/stop_profile:5.563255399465561e-05 - timing_per_token_ms/adv:8.857170651417692e-05 - timing_per_token_ms/update_actor:0.021325634595324945 - timing_per_token_ms/gen:0.03241751334240652 - timing_per_token_ms/ref:0.013697638934659105 - perf/total_num_tokens:1560601 - perf/time_per_step:108.2576196771115 - perf/throughput:3603.905675772843 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:175.0 - frontier/mean_score:2.1781067187499996 - frontier/mean_frontier_pct:0.06379998100886097 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:15.0 - frontier/batch_hard_count:0.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:1.7 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.51 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.51 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.3 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.9509999999999996 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.51 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.6569999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.6569999999999996 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.3 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:3.5509999999999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.6569999999999996 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.4119299999999995 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.3 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.51 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:2.09 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:2.09 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:3.0769999999999995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:1.7 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.3 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:1.91 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.3 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.6569999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.51 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:3.3598999999999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.3 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:2.51 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.3 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - 
frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.3 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.3 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.51 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.7 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:228.0 - cluster/prob_snapshot/cluster_0:0.014347322714258077 - cluster/prob_snapshot/cluster_1:0.014347322714258077 - cluster/prob_snapshot/cluster_2:0.012195224307119366 - cluster/prob_snapshot/cluster_3:0.014347322714258077 - cluster/prob_snapshot/cluster_4:0.012195224307119366 - cluster/prob_snapshot/cluster_5:0.01649942112139679 - cluster/prob_snapshot/cluster_6:0.014347322714258077 - cluster/prob_snapshot/cluster_7:0.018005890006393886 - cluster/prob_snapshot/cluster_8:0.018005890006393886 - cluster/prob_snapshot/cluster_9:0.01649942112139679 - cluster/prob_snapshot/cluster_10:0.014347322714258077 - cluster/prob_snapshot/cluster_11:0.01649942112139679 - cluster/prob_snapshot/cluster_12:0.02116947466488779 - cluster/prob_snapshot/cluster_13:0.018005890006393886 - cluster/prob_snapshot/cluster_14:0.019060418225891853 - 
cluster/prob_snapshot/cluster_15:0.01649942112139679 - cluster/prob_snapshot/cluster_16:0.019060418225891853 - cluster/prob_snapshot/cluster_17:0.01649942112139679 - cluster/prob_snapshot/cluster_18:0.014347322714258077 - cluster/prob_snapshot/cluster_19:0.014347322714258077 - cluster/prob_snapshot/cluster_20:0.025473671479165214 - cluster/prob_snapshot/cluster_21:0.019060418225891853 - cluster/prob_snapshot/cluster_22:0.014347322714258077 - cluster/prob_snapshot/cluster_23:0.017302369037100238 - cluster/prob_snapshot/cluster_24:0.01649942112139679 - cluster/prob_snapshot/cluster_25:0.010688755422122267 - cluster/prob_snapshot/cluster_26:0.014347322714258077 - cluster/prob_snapshot/cluster_27:0.018005890006393886 - cluster/prob_snapshot/cluster_28:0.01499295223639969 - cluster/prob_snapshot/cluster_29:0.012195224307119366 - cluster/prob_snapshot/cluster_30:0.01499295223639969 - cluster/prob_snapshot/cluster_31:0.02207335599588605 - cluster/prob_snapshot/cluster_32:0.010688755422122267 - cluster/prob_snapshot/cluster_33:0.012195224307119366 - cluster/prob_snapshot/cluster_34:0.014347322714258077 - cluster/prob_snapshot/cluster_35:0.01649942112139679 - cluster/prob_snapshot/cluster_36:0.013701693192116464 - cluster/prob_snapshot/cluster_37:0.01649942112139679 - cluster/prob_snapshot/cluster_38:0.014347322714258077 - cluster/prob_snapshot/cluster_39:0.019060418225891853 - cluster/prob_snapshot/cluster_40:0.014347322714258077 - cluster/prob_snapshot/cluster_41:0.018005890006393886 - cluster/prob_snapshot/cluster_42:0.010688755422122267 - cluster/prob_snapshot/cluster_43:0.014347322714258077 - cluster/prob_snapshot/cluster_44:0.010688755422122267 - cluster/prob_snapshot/cluster_45:0.012195224307119366 - cluster/prob_snapshot/cluster_46:0.024102784793817854 - cluster/prob_snapshot/cluster_47:0.01649942112139679 - cluster/prob_snapshot/cluster_48:0.018005890006393886 - cluster/prob_snapshot/cluster_49:0.01649942112139679 - 
cluster/prob_snapshot/cluster_50:0.012195224307119366 - cluster/prob_snapshot/cluster_51:0.014347322714258077 - cluster/prob_snapshot/cluster_52:0.014347322714258077 - cluster/prob_snapshot/cluster_53:0.014347322714258077 - cluster/prob_snapshot/cluster_54:0.01649942112139679 - cluster/prob_snapshot/cluster_55:0.01649942112139679 - cluster/prob_snapshot/cluster_56:0.01649942112139679 - cluster/prob_snapshot/cluster_57:0.014347322714258077 - cluster/prob_snapshot/cluster_58:0.018005890006393886 - cluster/prob_snapshot/cluster_59:0.014347322714258077 - cluster/prob_snapshot/cluster_60:0.012195224307119366 - cluster/prob_snapshot/cluster_61:0.014347322714258077 - cluster/prob_snapshot/cluster_62:0.01649942112139679 - cluster/prob_snapshot/cluster_63:0.012195224307119366
[36m(TaskRunner pid=2823680)[0m Training Progress:  29%|██▊       | 229/800 [7:24:40<21:05:07, 132.94s/it]
[36m(TaskRunner pid=2823680)[0m step:229 - global_seqlen/min:360781 - global_seqlen/max:472792 - global_seqlen/minmax_diff:112011 - global_seqlen/balanced_min:411908 - global_seqlen/balanced_max:411949 - global_seqlen/mean:411920.0 - frontier/skipped_zero_acc_count:27.0 - actor/entropy:np.float64(0.18062108924941106) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011124216951429844 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.09049885702552274) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004068483285664115) - actor/ppo_kl:np.float64(1.7863932789768398e-05) - actor/pg_clipfrac_lower:np.float64(5.459132558362576e-06) - actor/grad_norm:np.float64(0.24605929507659033) - perf/mfu/actor:np.float64(0.2117111716165002) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(114.27324295043945) - actor/lr:np.float64(1e-06) - training/global_step:229 - training/epoch:0 - critic/score/mean:0.6089109182357788 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.60118567943573 - critic/rewards/max:1.011866569519043 - critic/rewards/min:-0.09658581018447876 - critic/advantages/mean:-0.19214065372943878 - critic/advantages/max:2.4748284816741943 - critic/advantages/min:-2.4748542308807373 - critic/returns/mean:-0.19214065372943878 - critic/returns/max:2.4748284816741943 - critic/returns/min:-2.4748542308807373 - response_length/mean:1300.6522216796875 - response_length/max:8192.0 - response_length/min:166.0 - response_length/clip_ratio:0.021039603278040886 - response_length_non_aborted/mean:1300.6522216796875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:166.0 - response_length_non_aborted/clip_ratio:0.021039603278040886 - response/aborted_ratio:0.0 - prompt_length/mean:239.4158477783203 - prompt_length/max:416.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.815526962280273e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7589527918025851) - timing_s/agent_loop/generate_sequences/max:np.float64(30.008768630214036) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.696244542113163) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.008768630214036) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:201 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.96213209815323 - timing_s/reward:0.0001636594533920288 - timing_s/old_log_prob:15.538819492794573 - timing_s/ref:22.067555772140622 - timing_s/adv:0.0910769235342741 - timing_s/update_actor:23.074688765220344 - timing_s/update_weights:31.036026949994266 - timing_s/step:125.19532395526767 - timing_s/stop_profile:5.334243178367615e-05 - timing_per_token_ms/adv:7.319089786782449e-05 - timing_per_token_ms/update_actor:0.018543195391437745 - timing_per_token_ms/gen:0.0313648161082104 - timing_per_token_ms/ref:0.017733846928892513 - perf/total_num_tokens:1647680 - perf/time_per_step:125.19532395526767 - perf/throughput:3290.2187317090147 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:202.0 - frontier/mean_score:2.25156921875 - frontier/mean_frontier_pct:0.07421804756105552 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:2.9 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.3 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.6569999999999996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.51 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.3 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.9656999999999996 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.6569999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:2.7598999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.6569999999999996 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.3 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.3 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:3.5509999999999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.6569999999999996 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.4119299999999995 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.3 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:2.51 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:2.09 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.3629999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:3.0769999999999995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:2.09 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 
- frontier/cluster_35/score:3.11 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.237 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.3 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.3 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.6569999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.51 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:3.3598999999999997 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.3 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:2.51 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.3 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - 
frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.3 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.3 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.51 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.3 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.7 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:229.0 - cluster/prob_snapshot/cluster_0:0.020124853201340206 - cluster/prob_snapshot/cluster_1:0.015961090470028437 - cluster/prob_snapshot/cluster_2:0.011797327738716673 - cluster/prob_snapshot/cluster_3:0.013879209104372556 - cluster/prob_snapshot/cluster_4:0.01450377351406932 - cluster/prob_snapshot/cluster_5:0.015961090470028437 - cluster/prob_snapshot/cluster_6:0.013879209104372556 - cluster/prob_snapshot/cluster_7:0.018438529295158938 - cluster/prob_snapshot/cluster_8:0.017418407425987555 - cluster/prob_snapshot/cluster_9:0.015961090470028437 - cluster/prob_snapshot/cluster_10:0.013879209104372556 - cluster/prob_snapshot/cluster_11:0.015961090470028437 - cluster/prob_snapshot/cluster_12:0.02058078522041884 - cluster/prob_snapshot/cluster_13:0.018438529295158938 - 
cluster/prob_snapshot/cluster_14:0.019152614603578904 - cluster/prob_snapshot/cluster_15:0.015961090470028437 - cluster/prob_snapshot/cluster_16:0.018438529295158938 - cluster/prob_snapshot/cluster_17:0.015961090470028437 - cluster/prob_snapshot/cluster_18:0.015961090470028437 - cluster/prob_snapshot/cluster_19:0.013879209104372556 - cluster/prob_snapshot/cluster_20:0.02464253576481347 - cluster/prob_snapshot/cluster_21:0.018438529295158938 - cluster/prob_snapshot/cluster_22:0.013879209104372556 - cluster/prob_snapshot/cluster_23:0.016737840407554647 - cluster/prob_snapshot/cluster_24:0.015961090470028437 - cluster/prob_snapshot/cluster_25:0.010340010782757553 - cluster/prob_snapshot/cluster_26:0.013879209104372556 - cluster/prob_snapshot/cluster_27:0.017418407425987555 - cluster/prob_snapshot/cluster_28:0.01450377351406932 - cluster/prob_snapshot/cluster_29:0.011797327738716673 - cluster/prob_snapshot/cluster_30:0.01639828555681617 - cluster/prob_snapshot/cluster_31:0.021353163207077173 - cluster/prob_snapshot/cluster_32:0.010340010782757553 - cluster/prob_snapshot/cluster_33:0.01450377351406932 - cluster/prob_snapshot/cluster_34:0.013879209104372556 - cluster/prob_snapshot/cluster_35:0.021582170157299324 - cluster/prob_snapshot/cluster_36:0.015523895383240704 - cluster/prob_snapshot/cluster_37:0.015961090470028437 - cluster/prob_snapshot/cluster_38:0.015961090470028437 - cluster/prob_snapshot/cluster_39:0.018438529295158938 - cluster/prob_snapshot/cluster_40:0.013879209104372556 - cluster/prob_snapshot/cluster_41:0.017418407425987555 - cluster/prob_snapshot/cluster_42:0.010340010782757553 - cluster/prob_snapshot/cluster_43:0.013879209104372556 - cluster/prob_snapshot/cluster_44:0.010340010782757553 - cluster/prob_snapshot/cluster_45:0.011797327738716673 - cluster/prob_snapshot/cluster_46:0.023316377334890673 - cluster/prob_snapshot/cluster_47:0.015961090470028437 - cluster/prob_snapshot/cluster_48:0.017418407425987555 - 
cluster/prob_snapshot/cluster_49:0.015961090470028437 - cluster/prob_snapshot/cluster_50:0.010340010782757553 - cluster/prob_snapshot/cluster_51:0.013879209104372556 - cluster/prob_snapshot/cluster_52:0.013879209104372556 - cluster/prob_snapshot/cluster_53:0.013879209104372556 - cluster/prob_snapshot/cluster_54:0.015961090470028437 - cluster/prob_snapshot/cluster_55:0.015961090470028437 - cluster/prob_snapshot/cluster_56:0.015961090470028437 - cluster/prob_snapshot/cluster_57:0.013879209104372556 - cluster/prob_snapshot/cluster_58:0.017418407425987555 - cluster/prob_snapshot/cluster_59:0.015961090470028437 - cluster/prob_snapshot/cluster_60:0.011797327738716673 - cluster/prob_snapshot/cluster_61:0.013879209104372556 - cluster/prob_snapshot/cluster_62:0.017418407425987555 - cluster/prob_snapshot/cluster_63:0.011797327738716673
[36m(TaskRunner pid=2823680)[0m Training Progress:  29%|██▉       | 230/800 [7:26:27<19:49:06, 125.17s/it]
[36m(TaskRunner pid=2823680)[0m step:230 - global_seqlen/min:316322 - global_seqlen/max:466169 - global_seqlen/minmax_diff:149847 - global_seqlen/balanced_min:380998 - global_seqlen/balanced_max:381203 - global_seqlen/mean:381096.75 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.2225455550269948) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012988787144422531 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.023056723235640675) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006051902030271271) - actor/ppo_kl:np.float64(1.5047703433089686e-05) - actor/pg_clipfrac_lower:np.float64(2.0761101849428896e-06) - actor/grad_norm:np.float64(0.2511276490986347) - perf/mfu/actor:np.float64(0.23255982793454952) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(107.45601272583008) - actor/lr:np.float64(1e-06) - training/global_step:230 - training/epoch:0 - critic/score/mean:0.5916666388511658 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5819234251976013 - critic/rewards/max:1.0495749711990356 - critic/rewards/min:-0.05511148273944855 - critic/advantages/mean:-0.12173452973365784 - critic/advantages/max:2.474738359451294 - critic/advantages/min:-2.474848508834839 - critic/returns/mean:-0.12173452973365784 - critic/returns/max:2.474738359451294 - critic/returns/min:-2.474848508834839 - response_length/mean:1089.3736572265625 - response_length/max:8192.0 - response_length/min:196.0 - response_length/clip_ratio:0.011111111380159855 - response_length_non_aborted/mean:1089.3736572265625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:196.0 - response_length_non_aborted/clip_ratio:0.011111111380159855 - response/aborted_ratio:0.0 - prompt_length/mean:233.0111083984375 - prompt_length/max:344.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.606072515249252e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.523746570572257) - timing_s/agent_loop/generate_sequences/max:np.float64(28.830164259299636) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.777762352829086) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.830164259299636) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:213 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.721164452843368 - timing_s/reward:0.0001437738537788391 - timing_s/old_log_prob:9.37692280113697 - timing_s/ref:20.77383506577462 - timing_s/adv:0.06651504803448915 - timing_s/update_actor:19.298427955247462 - timing_s/update_weights:26.117680991068482 - timing_s/step:106.78177530597895 - timing_s/stop_profile:4.857126623392105e-05 - timing_per_token_ms/adv:6.986016218016185e-05 - timing_per_token_ms/update_actor:0.02026896689718539 - timing_per_token_ms/gen:0.03916772310902847 - timing_per_token_ms/ref:0.02181857383680222 - perf/total_num_tokens:1524387 - perf/time_per_step:106.78177530597895 - perf/throughput:3568.930642967701 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:240.0 - frontier/mean_score:2.2448715781249997 - frontier/mean_frontier_pct:0.09857930182569569 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:2.9299999999999997 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.3 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.6569999999999996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.51 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.3 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.9656999999999996 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.7598999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:2.7598999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.6569999999999996 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - 
frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.3 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:3.5509999999999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.6569999999999996 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.5883509999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.3 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.0569999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:2.09 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:1.9540999999999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:3.0769999999999995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:2.3629999999999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.11 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:32.0 - frontier/cluster_36/score:1.8659000000000001 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.3 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.6569999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.51 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:3.2519299999999993 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:2.51 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.51 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.7 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - 
frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.3 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.3 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.51 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.3 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:1.7 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:230.0 - cluster/prob_snapshot/cluster_0:0.020393705566996486 - cluster/prob_snapshot/cluster_1:0.01600871085463888 - cluster/prob_snapshot/cluster_2:0.011832525414298304 - cluster/prob_snapshot/cluster_3:0.013920618134468593 - cluster/prob_snapshot/cluster_4:0.014547045950519678 - cluster/prob_snapshot/cluster_5:0.01600871085463888 - cluster/prob_snapshot/cluster_6:0.013920618134468593 - cluster/prob_snapshot/cluster_7:0.018493541191641524 - cluster/prob_snapshot/cluster_8:0.017470375758758084 - cluster/prob_snapshot/cluster_9:0.01600871085463888 - cluster/prob_snapshot/cluster_10:0.013920618134468593 - cluster/prob_snapshot/cluster_11:0.01600871085463888 - 
cluster/prob_snapshot/cluster_12:0.02064218860069675 - cluster/prob_snapshot/cluster_13:0.019209756994659932 - cluster/prob_snapshot/cluster_14:0.019209756994659932 - cluster/prob_snapshot/cluster_15:0.01600871085463888 - cluster/prob_snapshot/cluster_16:0.018493541191641524 - cluster/prob_snapshot/cluster_17:0.017470375758758084 - cluster/prob_snapshot/cluster_18:0.01600871085463888 - cluster/prob_snapshot/cluster_19:0.01600871085463888 - cluster/prob_snapshot/cluster_20:0.024716057497748984 - cluster/prob_snapshot/cluster_21:0.018493541191641524 - cluster/prob_snapshot/cluster_22:0.011832525414298304 - cluster/prob_snapshot/cluster_23:0.018015722934484956 - cluster/prob_snapshot/cluster_24:0.01600871085463888 - cluster/prob_snapshot/cluster_25:0.010370860510179101 - cluster/prob_snapshot/cluster_26:0.013920618134468593 - cluster/prob_snapshot/cluster_27:0.014317355751300944 - cluster/prob_snapshot/cluster_28:0.014547045950519678 - cluster/prob_snapshot/cluster_29:0.011832525414298304 - cluster/prob_snapshot/cluster_30:0.013601139948282538 - cluster/prob_snapshot/cluster_31:0.021416870999879926 - cluster/prob_snapshot/cluster_32:0.010370860510179101 - cluster/prob_snapshot/cluster_33:0.01644721032587464 - cluster/prob_snapshot/cluster_34:0.013920618134468593 - cluster/prob_snapshot/cluster_35:0.021646561199098663 - cluster/prob_snapshot/cluster_36:0.012987240688552474 - cluster/prob_snapshot/cluster_37:0.017470375758758084 - cluster/prob_snapshot/cluster_38:0.01600871085463888 - cluster/prob_snapshot/cluster_39:0.018493541191641524 - cluster/prob_snapshot/cluster_40:0.013920618134468593 - cluster/prob_snapshot/cluster_41:0.017470375758758084 - cluster/prob_snapshot/cluster_42:0.010370860510179101 - cluster/prob_snapshot/cluster_43:0.013920618134468593 - cluster/prob_snapshot/cluster_44:0.010370860510179101 - cluster/prob_snapshot/cluster_45:0.011832525414298304 - cluster/prob_snapshot/cluster_46:0.022634437865011223 - 
cluster/prob_snapshot/cluster_47:0.017470375758758084 - cluster/prob_snapshot/cluster_48:0.017470375758758084 - cluster/prob_snapshot/cluster_49:0.017470375758758084 - cluster/prob_snapshot/cluster_50:0.010370860510179101 - cluster/prob_snapshot/cluster_51:0.011832525414298304 - cluster/prob_snapshot/cluster_52:0.013920618134468593 - cluster/prob_snapshot/cluster_53:0.013920618134468593 - cluster/prob_snapshot/cluster_54:0.01600871085463888 - cluster/prob_snapshot/cluster_55:0.01600871085463888 - cluster/prob_snapshot/cluster_56:0.01600871085463888 - cluster/prob_snapshot/cluster_57:0.013920618134468593 - cluster/prob_snapshot/cluster_58:0.017470375758758084 - cluster/prob_snapshot/cluster_59:0.01600871085463888 - cluster/prob_snapshot/cluster_60:0.010370860510179101 - cluster/prob_snapshot/cluster_61:0.013920618134468593 - cluster/prob_snapshot/cluster_62:0.017470375758758084 - cluster/prob_snapshot/cluster_63:0.011832525414298304
[36m(TaskRunner pid=2823680)[0m Training Progress:  29%|██▉       | 231/800 [7:28:13<18:53:37, 119.54s/it]
[36m(TaskRunner pid=2823680)[0m step:231 - global_seqlen/min:295476 - global_seqlen/max:424491 - global_seqlen/minmax_diff:129015 - global_seqlen/balanced_min:353083 - global_seqlen/balanced_max:353363 - global_seqlen/mean:353207.75 - frontier/skipped_zero_acc_count:23.0 - actor/entropy:np.float64(0.2460754493708318) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013796917162835598 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.07668608986205072) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005587039052460448) - actor/ppo_kl:np.float64(0.0001406415410232692) - actor/pg_clipfrac_lower:np.float64(5.845640634158011e-06) - actor/grad_norm:np.float64(0.25995334557124544) - perf/mfu/actor:np.float64(0.14876752126949694) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(109.24077987670898) - actor/lr:np.float64(1e-06) - training/global_step:231 - training/epoch:0 - critic/score/mean:0.648809552192688 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6398231387138367 - critic/rewards/max:1.0410288572311401 - critic/rewards/min:-0.054197825491428375 - critic/advantages/mean:-0.1907549351453781 - critic/advantages/max:2.4748456478118896 - critic/advantages/min:-2.4748451709747314 - critic/returns/mean:-0.1907549351453781 - critic/returns/max:2.4748456478118896 - critic/returns/min:-2.4748451709747314 - response_length/mean:1032.8582763671875 - response_length/max:8192.0 - response_length/min:173.0 - response_length/clip_ratio:0.015476190485060215 - response_length_non_aborted/mean:1032.8582763671875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:173.0 - response_length_non_aborted/clip_ratio:0.015476190485060215 - response/aborted_ratio:0.0 - prompt_length/mean:242.91429138183594 - prompt_length/max:382.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.402112871408463e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4067941969260573) - timing_s/agent_loop/generate_sequences/max:np.float64(29.59497898630798) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.08334072722846) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.59497898630798) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:214 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.433615419082344 - timing_s/reward:0.00012497231364250183 - timing_s/old_log_prob:11.039830051362514 - timing_s/ref:9.593870613723993 - timing_s/adv:0.10896698944270611 - timing_s/update_actor:27.802735855802894 - timing_s/update_weights:25.739109214395285 - timing_s/step:106.14722589496523 - timing_s/stop_profile:0.0001107463613152504 - timing_per_token_ms/adv:0.00010168160418449148 - timing_per_token_ms/update_actor:0.025943882610633608 - timing_per_token_ms/gen:0.036230496990070714 - timing_per_token_ms/ref:0.008952437424682888 - perf/total_num_tokens:1412831 - perf/time_per_step:106.14722589496523 - perf/throughput:3327.526904466689 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:263.0 - frontier/mean_score:2.3044923593749997 - frontier/mean_frontier_pct:0.11987407385829499 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.5509999999999997 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.3 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.3 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:2.7598999999999996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.3 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.9 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.9656999999999996 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.7598999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:2.7598999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.7598999999999996 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - 
frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.3 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:3.3856999999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.6569999999999996 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.5883509999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:3.11 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.0569999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.3629999999999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:1.9540999999999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:3.0769999999999995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:2.3629999999999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - 
frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.11 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:1.60613 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.3 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.6569999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.6569999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:3.2519299999999993 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:2.51 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:1.7 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.3 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:2.51 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.51 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.3 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:2.09 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:231.0 - cluster/prob_snapshot/cluster_0:0.024076614866732682 - cluster/prob_snapshot/cluster_1:0.015594540747250118 - cluster/prob_snapshot/cluster_2:0.011526399682750089 - cluster/prob_snapshot/cluster_3:0.013560470215000104 - cluster/prob_snapshot/cluster_4:0.014170691374675108 - cluster/prob_snapshot/cluster_5:0.015594540747250118 - cluster/prob_snapshot/cluster_6:0.015594540747250118 - cluster/prob_snapshot/cluster_7:0.018712770873189392 - cluster/prob_snapshot/cluster_8:0.018015084680627635 - cluster/prob_snapshot/cluster_9:0.015594540747250118 - cluster/prob_snapshot/cluster_10:0.01966268181175015 - cluster/prob_snapshot/cluster_11:0.015594540747250118 - 
cluster/prob_snapshot/cluster_12:0.0201081432583129 - cluster/prob_snapshot/cluster_13:0.018712770873189392 - cluster/prob_snapshot/cluster_14:0.018712770873189392 - cluster/prob_snapshot/cluster_15:0.015594540747250118 - cluster/prob_snapshot/cluster_16:0.018712770873189392 - cluster/prob_snapshot/cluster_17:0.01701839011982513 - cluster/prob_snapshot/cluster_18:0.015594540747250118 - cluster/prob_snapshot/cluster_19:0.015594540747250118 - cluster/prob_snapshot/cluster_20:0.02295584200346292 - cluster/prob_snapshot/cluster_21:0.018015084680627635 - cluster/prob_snapshot/cluster_22:0.011526399682750089 - cluster/prob_snapshot/cluster_23:0.01754962832073286 - cluster/prob_snapshot/cluster_24:0.02108653118432516 - cluster/prob_snapshot/cluster_25:0.010102550310175077 - cluster/prob_snapshot/cluster_26:0.013560470215000104 - cluster/prob_snapshot/cluster_27:0.013946943616127604 - cluster/prob_snapshot/cluster_28:0.01602169555902262 - cluster/prob_snapshot/cluster_29:0.010102550310175077 - cluster/prob_snapshot/cluster_30:0.01324925742356585 - cluster/prob_snapshot/cluster_31:0.020862783425777656 - cluster/prob_snapshot/cluster_32:0.010102550310175077 - cluster/prob_snapshot/cluster_33:0.01602169555902262 - cluster/prob_snapshot/cluster_34:0.015594540747250118 - cluster/prob_snapshot/cluster_35:0.02108653118432516 - cluster/prob_snapshot/cluster_36:0.010889939013209058 - cluster/prob_snapshot/cluster_37:0.01701839011982513 - cluster/prob_snapshot/cluster_38:0.015594540747250118 - cluster/prob_snapshot/cluster_39:0.018015084680627635 - cluster/prob_snapshot/cluster_40:0.013560470215000104 - cluster/prob_snapshot/cluster_41:0.018015084680627635 - cluster/prob_snapshot/cluster_42:0.010102550310175077 - cluster/prob_snapshot/cluster_43:0.013560470215000104 - cluster/prob_snapshot/cluster_44:0.010102550310175077 - cluster/prob_snapshot/cluster_45:0.011526399682750089 - cluster/prob_snapshot/cluster_46:0.02204884995313264 - 
cluster/prob_snapshot/cluster_47:0.01701839011982513 - cluster/prob_snapshot/cluster_48:0.01701839011982513 - cluster/prob_snapshot/cluster_49:0.018015084680627635 - cluster/prob_snapshot/cluster_50:0.010102550310175077 - cluster/prob_snapshot/cluster_51:0.011526399682750089 - cluster/prob_snapshot/cluster_52:0.013560470215000104 - cluster/prob_snapshot/cluster_53:0.013560470215000104 - cluster/prob_snapshot/cluster_54:0.015594540747250118 - cluster/prob_snapshot/cluster_55:0.01701839011982513 - cluster/prob_snapshot/cluster_56:0.015594540747250118 - cluster/prob_snapshot/cluster_57:0.013560470215000104 - cluster/prob_snapshot/cluster_58:0.01701839011982513 - cluster/prob_snapshot/cluster_59:0.015594540747250118 - cluster/prob_snapshot/cluster_60:0.010102550310175077 - cluster/prob_snapshot/cluster_61:0.013560470215000104 - cluster/prob_snapshot/cluster_62:0.01701839011982513 - cluster/prob_snapshot/cluster_63:0.014170691374675108
[36m(TaskRunner pid=2823680)[0m Training Progress:  29%|██▉       | 232/800 [7:30:05<18:30:00, 117.25s/it]
[36m(TaskRunner pid=2823680)[0m step:232 - global_seqlen/min:313386 - global_seqlen/max:408575 - global_seqlen/minmax_diff:95189 - global_seqlen/balanced_min:365932 - global_seqlen/balanced_max:366121 - global_seqlen/mean:366023.75 - frontier/skipped_zero_acc_count:27.0 - actor/entropy:np.float64(0.20814864216920206) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012350245378911495 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.18912083042960148) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006880749291061269) - actor/ppo_kl:np.float64(-2.0883163897473887e-05) - actor/pg_clipfrac_lower:np.float64(7.898137411365157e-06) - actor/grad_norm:np.float64(0.2551667930988165) - perf/mfu/actor:np.float64(0.20150011505878304) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.34940338134766) - actor/lr:np.float64(1e-06) - training/global_step:232 - training/epoch:0 - critic/score/mean:0.6188119053840637 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6092686057090759 - critic/rewards/max:1.0113658905029297 - critic/rewards/min:-0.05848833918571472 - critic/advantages/mean:-0.11840402334928513 - critic/advantages/max:2.4747989177703857 - critic/advantages/min:-2.474820613861084 - critic/returns/mean:-0.11840402334928513 - critic/returns/max:2.4747989177703857 - critic/returns/min:-2.474820613861084 - response_length/mean:1149.272216796875 - response_length/max:8192.0 - response_length/min:160.0 - response_length/clip_ratio:0.01608910970389843 - response_length_non_aborted/mean:1149.272216796875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:160.0 - response_length_non_aborted/clip_ratio:0.01608910970389843 - response/aborted_ratio:0.0 - prompt_length/mean:239.19801330566406 - prompt_length/max:527.0 - prompt_length/min:170.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.20342281460762e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2557124318554997) - timing_s/agent_loop/generate_sequences/max:np.float64(27.76420085504651) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.5546246873063865) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.76420085504651) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:195 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.670266821049154 - timing_s/reward:0.00018671806901693344 - timing_s/old_log_prob:10.572485087439418 - timing_s/ref:20.023495559580624 - timing_s/adv:0.08930022083222866 - timing_s/update_actor:23.88870286848396 - timing_s/update_weights:27.04362003877759 - timing_s/step:111.69739517383277 - timing_s/stop_profile:5.0319358706474304e-05 - timing_per_token_ms/adv:7.959844407463575e-05 - timing_per_token_ms/update_actor:0.021293380481835877 - timing_per_token_ms/gen:0.03195119901643437 - timing_per_token_ms/ref:0.017848097984801123 - perf/total_num_tokens:1464095 - perf/time_per_step:111.69739517383277 - perf/throughput:3276.922881060596 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:290.0 - frontier/mean_score:2.346802671875 - frontier/mean_frontier_pct:0.13440930775486443 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.5509999999999997 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.3 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.91 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:2.8319299999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.3 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:3.53 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.51 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.9656999999999996 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.8319299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:2.7598999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.7598999999999996 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - 
frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.3 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:3.3856999999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.6569999999999996 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.5883509999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:3.6769999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:2.0569999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.3629999999999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:1.9540999999999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:3.0538999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.49 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:2.3629999999999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - 
frontier/cluster_34/score:1.91 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.11 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:1.60613 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:3.11 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.7598999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.6569999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.49 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:3.2519299999999993 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:2.51 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:2.09 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.91 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:2.51 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.6569999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.3 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.3 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:2.09 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:232.0 - cluster/prob_snapshot/cluster_0:0.023642539556029324 - cluster/prob_snapshot/cluster_1:0.015313388053750338 - cluster/prob_snapshot/cluster_2:0.011318591170163294 - cluster/prob_snapshot/cluster_3:0.015313388053750338 - cluster/prob_snapshot/cluster_4:0.013915209144494872 - cluster/prob_snapshot/cluster_5:0.015313388053750338 - cluster/prob_snapshot/cluster_6:0.012716770079418759 - cluster/prob_snapshot/cluster_7:0.018854975230894428 - cluster/prob_snapshot/cluster_8:0.017690292199484627 - cluster/prob_snapshot/cluster_9:0.015313388053750338 - cluster/prob_snapshot/cluster_10:0.02350272166510378 - 
cluster/prob_snapshot/cluster_11:0.016711566963005804 - cluster/prob_snapshot/cluster_12:0.019745615196090163 - cluster/prob_snapshot/cluster_13:0.018854975230894428 - cluster/prob_snapshot/cluster_14:0.018375399865019806 - cluster/prob_snapshot/cluster_15:0.015313388053750338 - cluster/prob_snapshot/cluster_16:0.018375399865019806 - cluster/prob_snapshot/cluster_17:0.016711566963005804 - cluster/prob_snapshot/cluster_18:0.015313388053750338 - cluster/prob_snapshot/cluster_19:0.015313388053750338 - cluster/prob_snapshot/cluster_20:0.022541973014601095 - cluster/prob_snapshot/cluster_21:0.017690292199484627 - cluster/prob_snapshot/cluster_22:0.011318591170163294 - cluster/prob_snapshot/cluster_23:0.017233227514049015 - cluster/prob_snapshot/cluster_24:0.024481446901582604 - cluster/prob_snapshot/cluster_25:0.009920412260907828 - cluster/prob_snapshot/cluster_26:0.013315989611956817 - cluster/prob_snapshot/cluster_27:0.013695495315897583 - cluster/prob_snapshot/cluster_28:0.015732841726526975 - cluster/prob_snapshot/cluster_29:0.009920412260907828 - cluster/prob_snapshot/cluster_30:0.013010387650362406 - cluster/prob_snapshot/cluster_31:0.020332850337977458 - cluster/prob_snapshot/cluster_32:0.009920412260907828 - cluster/prob_snapshot/cluster_33:0.015732841726526975 - cluster/prob_snapshot/cluster_34:0.012716770079418759 - cluster/prob_snapshot/cluster_35:0.02070636384659285 - cluster/prob_snapshot/cluster_36:0.010693605197726101 - cluster/prob_snapshot/cluster_37:0.016711566963005804 - cluster/prob_snapshot/cluster_38:0.02070636384659285 - cluster/prob_snapshot/cluster_39:0.018375399865019806 - cluster/prob_snapshot/cluster_40:0.013315989611956817 - cluster/prob_snapshot/cluster_41:0.017690292199484627 - cluster/prob_snapshot/cluster_42:0.009920412260907828 - cluster/prob_snapshot/cluster_43:0.013315989611956817 - cluster/prob_snapshot/cluster_44:0.009920412260907828 - cluster/prob_snapshot/cluster_45:0.011318591170163294 - 
cluster/prob_snapshot/cluster_46:0.02165133304940536 - cluster/prob_snapshot/cluster_47:0.016711566963005804 - cluster/prob_snapshot/cluster_48:0.016711566963005804 - cluster/prob_snapshot/cluster_49:0.017690292199484627 - cluster/prob_snapshot/cluster_50:0.009920412260907828 - cluster/prob_snapshot/cluster_51:0.013915209144494872 - cluster/prob_snapshot/cluster_52:0.013315989611956817 - cluster/prob_snapshot/cluster_53:0.013315989611956817 - cluster/prob_snapshot/cluster_54:0.012716770079418759 - cluster/prob_snapshot/cluster_55:0.016711566963005804 - cluster/prob_snapshot/cluster_56:0.015313388053750338 - cluster/prob_snapshot/cluster_57:0.015313388053750338 - cluster/prob_snapshot/cluster_58:0.017690292199484627 - cluster/prob_snapshot/cluster_59:0.015313388053750338 - cluster/prob_snapshot/cluster_60:0.009920412260907828 - cluster/prob_snapshot/cluster_61:0.015313388053750338 - cluster/prob_snapshot/cluster_62:0.016711566963005804 - cluster/prob_snapshot/cluster_63:0.013915209144494872
[36m(TaskRunner pid=2823680)[0m Training Progress:  29%|██▉       | 233/800 [7:32:01<18:23:13, 116.74s/it]
[36m(TaskRunner pid=2823680)[0m step:233 - global_seqlen/min:344693 - global_seqlen/max:428650 - global_seqlen/minmax_diff:83957 - global_seqlen/balanced_min:401936 - global_seqlen/balanced_max:402234 - global_seqlen/mean:402101.0 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.19175262844228014) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013141758739948273 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.024601648503448814) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004726999792851251) - actor/ppo_kl:np.float64(3.155381292270583e-05) - actor/pg_clipfrac_lower:np.float64(6.412779097148331e-06) - actor/grad_norm:np.float64(0.24310406469381773) - perf/mfu/actor:np.float64(0.21623644291815367) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.72195816040039) - actor/lr:np.float64(1e-06) - training/global_step:233 - training/epoch:0 - critic/score/mean:0.5599489808082581 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5504496097564697 - critic/rewards/max:1.0500288009643555 - critic/rewards/min:-0.15198957920074463 - critic/advantages/mean:-0.12722042202949524 - critic/advantages/max:2.474822998046875 - critic/advantages/min:-2.47481632232666 - critic/returns/mean:-0.12722042202949524 - critic/returns/max:2.474822998046875 - critic/returns/min:-2.47481632232666 - response_length/mean:1247.7818603515625 - response_length/max:8192.0 - response_length/min:204.0 - response_length/clip_ratio:0.021683674305677414 - response_length_non_aborted/mean:1247.7818603515625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:204.0 - response_length_non_aborted/clip_ratio:0.021683674305677414 - response/aborted_ratio:0.0 - prompt_length/mean:243.34693908691406 - prompt_length/max:383.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.783582597970963e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7494858363643289) - timing_s/agent_loop/generate_sequences/max:np.float64(30.28289254847914) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.208430370606038) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.28289254847914) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:207 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.91383597627282 - timing_s/reward:0.00018082838505506516 - timing_s/old_log_prob:10.208484152331948 - timing_s/ref:21.85786162596196 - timing_s/adv:0.08293394558131695 - timing_s/update_actor:21.971693029627204 - timing_s/update_weights:28.836244880221784 - timing_s/step:115.34074825048447 - timing_s/stop_profile:5.484279245138168e-05 - timing_per_token_ms/adv:7.094161951106839e-05 - timing_per_token_ms/update_actor:0.01879456567508283 - timing_per_token_ms/gen:0.032623027981564044 - timing_per_token_ms/ref:0.01869719439881438 - perf/total_num_tokens:1608404 - perf/time_per_step:115.34074825048447 - perf/throughput:3486.2007235011242 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:320.0 - frontier/mean_score:2.36130003125 - frontier/mean_frontier_pct:0.14967374669036965 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.5509999999999997 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.51 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.09 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.91 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.8823509999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.3 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:3.3709999999999996 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.51 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:2.9759899999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.8319299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:2.7598999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.8319299999999994 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:2.0569999999999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.3 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:3.2699899999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.6569999999999996 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.5883509999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:3.6769999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:1.7398999999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.3629999999999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:1.9540999999999997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:3.0538999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.9429999999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:2.3629999999999995 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.237 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.11 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:1.60613 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:3.11 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.7598999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.6569999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.9429999999999998 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:3.2519299999999993 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:2.51 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:2.09 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:1.91 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.0569999999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:32.0 - frontier/cluster_58/score:2.7598999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.51 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.3 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.6569999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:2.09 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:233.0 - cluster/prob_snapshot/cluster_0:0.023497384604119226 - cluster/prob_snapshot/cluster_1:0.01660896518060807 - cluster/prob_snapshot/cluster_2:0.011249099923121005 - cluster/prob_snapshot/cluster_3:0.015219370484222535 - cluster/prob_snapshot/cluster_4:0.013829775787837 - cluster/prob_snapshot/cluster_5:0.015219370484222535 - cluster/prob_snapshot/cluster_6:0.01263869461950654 - cluster/prob_snapshot/cluster_7:0.01907285553676926 - cluster/prob_snapshot/cluster_8:0.017581681468077943 - 
cluster/prob_snapshot/cluster_9:0.015219370484222535 - cluster/prob_snapshot/cluster_10:0.022306303435788766 - cluster/prob_snapshot/cluster_11:0.01660896518060807 - cluster/prob_snapshot/cluster_12:0.019692475811887572 - cluster/prob_snapshot/cluster_13:0.018739213850167094 - cluster/prob_snapshot/cluster_14:0.018262582869306855 - cluster/prob_snapshot/cluster_15:0.015219370484222535 - cluster/prob_snapshot/cluster_16:0.018739213850167094 - cluster/prob_snapshot/cluster_17:0.013611410906976413 - cluster/prob_snapshot/cluster_18:0.015219370484222535 - cluster/prob_snapshot/cluster_19:0.015219370484222535 - cluster/prob_snapshot/cluster_20:0.021637908386827324 - cluster/prob_snapshot/cluster_21:0.017581681468077943 - cluster/prob_snapshot/cluster_22:0.011249099923121005 - cluster/prob_snapshot/cluster_23:0.01712742296182951 - cluster/prob_snapshot/cluster_24:0.024331141421950548 - cluster/prob_snapshot/cluster_25:0.009859505226735468 - cluster/prob_snapshot/cluster_26:0.01323423520367177 - cluster/prob_snapshot/cluster_27:0.011513122915434253 - cluster/prob_snapshot/cluster_28:0.015636248893138194 - cluster/prob_snapshot/cluster_29:0.009859505226735468 - cluster/prob_snapshot/cluster_30:0.012930509505747501 - cluster/prob_snapshot/cluster_31:0.020208015444246608 - cluster/prob_snapshot/cluster_32:0.012857059500367123 - cluster/prob_snapshot/cluster_33:0.015636248893138194 - cluster/prob_snapshot/cluster_34:0.014802492075306876 - cluster/prob_snapshot/cluster_35:0.0205792357417096 - cluster/prob_snapshot/cluster_36:0.010627951093836671 - cluster/prob_snapshot/cluster_37:0.01660896518060807 - cluster/prob_snapshot/cluster_38:0.0205792357417096 - cluster/prob_snapshot/cluster_39:0.018262582869306855 - cluster/prob_snapshot/cluster_40:0.01323423520367177 - cluster/prob_snapshot/cluster_41:0.017581681468077943 - cluster/prob_snapshot/cluster_42:0.009859505226735468 - cluster/prob_snapshot/cluster_43:0.01323423520367177 - 
cluster/prob_snapshot/cluster_44:0.012857059500367123 - cluster/prob_snapshot/cluster_45:0.013829775787837 - cluster/prob_snapshot/cluster_46:0.021518403242938165 - cluster/prob_snapshot/cluster_47:0.01660896518060807 - cluster/prob_snapshot/cluster_48:0.01660896518060807 - cluster/prob_snapshot/cluster_49:0.017581681468077943 - cluster/prob_snapshot/cluster_50:0.009859505226735468 - cluster/prob_snapshot/cluster_51:0.013829775787837 - cluster/prob_snapshot/cluster_52:0.01323423520367177 - cluster/prob_snapshot/cluster_53:0.01323423520367177 - cluster/prob_snapshot/cluster_54:0.01263869461950654 - cluster/prob_snapshot/cluster_55:0.013611410906976413 - cluster/prob_snapshot/cluster_56:0.015219370484222535 - cluster/prob_snapshot/cluster_57:0.015219370484222535 - cluster/prob_snapshot/cluster_58:0.018262582869306855 - cluster/prob_snapshot/cluster_59:0.01660896518060807 - cluster/prob_snapshot/cluster_60:0.009859505226735468 - cluster/prob_snapshot/cluster_61:0.015219370484222535 - cluster/prob_snapshot/cluster_62:0.017581681468077943 - cluster/prob_snapshot/cluster_63:0.013829775787837
[36m(TaskRunner pid=2823680)[0m Training Progress:  29%|██▉       | 234/800 [7:33:51<18:03:50, 114.89s/it]
[36m(TaskRunner pid=2823680)[0m step:234 - global_seqlen/min:319504 - global_seqlen/max:464615 - global_seqlen/minmax_diff:145111 - global_seqlen/balanced_min:391767 - global_seqlen/balanced_max:391842 - global_seqlen/mean:391790.25 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.1819575458993109) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013771784491837025 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.0503641782925115) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005578426002733866) - actor/ppo_kl:np.float64(7.031362874249217e-05) - actor/pg_clipfrac_lower:np.float64(6.085953939372284e-06) - actor/grad_norm:np.float64(0.23985089132419) - perf/mfu/actor:np.float64(0.2008767902167569) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(103.8740119934082) - actor/lr:np.float64(1e-06) - training/global_step:234 - training/epoch:0 - critic/score/mean:0.626288652420044 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6178690195083618 - critic/rewards/max:1.2435730695724487 - critic/rewards/min:-0.10218284279108047 - critic/advantages/mean:-0.12565556168556213 - critic/advantages/max:2.474820375442505 - critic/advantages/min:-2.4748525619506836 - critic/returns/mean:-0.12565556168556213 - critic/returns/max:2.474820375442505 - critic/returns/min:-2.4748525619506836 - response_length/mean:1154.8272705078125 - response_length/max:8192.0 - response_length/min:195.0 - response_length/clip_ratio:0.0167525764554739 - response_length_non_aborted/mean:1154.8272705078125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:195.0 - response_length_non_aborted/clip_ratio:0.0167525764554739 - response/aborted_ratio:0.0 - prompt_length/mean:234.18556213378906 - prompt_length/max:350.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.417293429374695e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5026815263554454) - timing_s/agent_loop/generate_sequences/max:np.float64(28.88525260705501) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.886119068934022) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.88525260705501) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:207 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.808881178498268 - timing_s/reward:0.00017343182116746902 - timing_s/old_log_prob:10.513227832503617 - timing_s/ref:19.313135999254882 - timing_s/adv:0.09681195672601461 - timing_s/update_actor:22.89487500395626 - timing_s/update_weights:26.195303566753864 - timing_s/step:110.3100008694455 - timing_s/stop_profile:6.055552512407303e-05 - timing_per_token_ms/adv:8.981750810021821e-05 - timing_per_token_ms/update_actor:0.021240771188428573 - timing_per_token_ms/gen:0.034379310043785576 - timing_per_token_ms/ref:0.01791780486332807 - perf/total_num_tokens:1567161 - perf/time_per_step:110.3100008694455 - perf/throughput:3551.720124303988 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:351.0 - frontier/mean_score:2.4097919523809526 - frontier/mean_frontier_pct:0.147379698518207 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.5509999999999997 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.6569999999999996 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.3629999999999995 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.51 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.91 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.8823509999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.3 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:48.0 - frontier/cluster_10/score:3.8596999999999997 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.6569999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:2.9759899999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.8319299999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:2.7598999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.8319299999999994 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:2.0569999999999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.3 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:3.2699899999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.6569999999999996 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.5883509999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:3.4738999999999995 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:1.7398999999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.3629999999999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.2678699999999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:3.0538999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.9429999999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:2.5540999999999996 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.237 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.11 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.024291 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.6569999999999996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:3.11 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.8319299999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:2.6569999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:1.9429999999999998 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:3.2519299999999993 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:2.51 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:2.09 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.237 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.0569999999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:32.0 - frontier/cluster_58/score:2.8319299999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.51 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.3 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.6569999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:2.09 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:234.0 - cluster/prob_snapshot/cluster_0:0.023390018922334284 - cluster/prob_snapshot/cluster_1:0.017501346177595658 - cluster/prob_snapshot/cluster_2:0.009814454574564372 - cluster/prob_snapshot/cluster_3:0.015149829209059096 - cluster/prob_snapshot/cluster_4:0.015564802791742018 - cluster/prob_snapshot/cluster_5:0.01653307448466884 - cluster/prob_snapshot/cluster_6:0.012580945125783859 - cluster/prob_snapshot/cluster_7:0.018985706682852474 - cluster/prob_snapshot/cluster_8:0.017501346177595658 - 
cluster/prob_snapshot/cluster_9:0.015149829209059096 - cluster/prob_snapshot/cluster_10:0.025423389477480605 - cluster/prob_snapshot/cluster_11:0.017501346177595658 - cluster/prob_snapshot/cluster_12:0.01960249575124686 - cluster/prob_snapshot/cluster_13:0.018653589492178574 - cluster/prob_snapshot/cluster_14:0.018179136362644434 - cluster/prob_snapshot/cluster_15:0.015149829209059096 - cluster/prob_snapshot/cluster_16:0.018653589492178574 - cluster/prob_snapshot/cluster_17:0.013549216818710677 - cluster/prob_snapshot/cluster_18:0.015149829209059096 - cluster/prob_snapshot/cluster_19:0.015149829209059096 - cluster/prob_snapshot/cluster_20:0.0215390391371005 - cluster/prob_snapshot/cluster_21:0.017501346177595658 - cluster/prob_snapshot/cluster_22:0.009814454574564372 - cluster/prob_snapshot/cluster_23:0.017049163296998834 - cluster/prob_snapshot/cluster_24:0.022882170299717564 - cluster/prob_snapshot/cluster_25:0.009814454574564372 - cluster/prob_snapshot/cluster_26:0.013173764529616607 - cluster/prob_snapshot/cluster_27:0.011460516452539964 - cluster/prob_snapshot/cluster_28:0.015564802791742018 - cluster/prob_snapshot/cluster_29:0.009814454574564372 - cluster/prob_snapshot/cluster_30:0.014938192681890804 - cluster/prob_snapshot/cluster_31:0.020115679748498073 - cluster/prob_snapshot/cluster_32:0.012798312240522532 - cluster/prob_snapshot/cluster_33:0.016823555992546885 - cluster/prob_snapshot/cluster_34:0.014734855626376175 - cluster/prob_snapshot/cluster_35:0.02048520384355382 - cluster/prob_snapshot/cluster_36:0.013333766486711064 - cluster/prob_snapshot/cluster_37:0.017501346177595658 - cluster/prob_snapshot/cluster_38:0.02048520384355382 - cluster/prob_snapshot/cluster_39:0.018653589492178574 - cluster/prob_snapshot/cluster_40:0.013173764529616607 - cluster/prob_snapshot/cluster_41:0.017501346177595658 - cluster/prob_snapshot/cluster_42:0.009814454574564372 - cluster/prob_snapshot/cluster_43:0.013173764529616607 - 
cluster/prob_snapshot/cluster_44:0.012798312240522532 - cluster/prob_snapshot/cluster_45:0.013766583933449353 - cluster/prob_snapshot/cluster_46:0.02142008004339806 - cluster/prob_snapshot/cluster_47:0.01653307448466884 - cluster/prob_snapshot/cluster_48:0.01653307448466884 - cluster/prob_snapshot/cluster_49:0.017501346177595658 - cluster/prob_snapshot/cluster_50:0.009814454574564372 - cluster/prob_snapshot/cluster_51:0.013766583933449353 - cluster/prob_snapshot/cluster_52:0.013173764529616607 - cluster/prob_snapshot/cluster_53:0.013173764529616607 - cluster/prob_snapshot/cluster_54:0.014734855626376175 - cluster/prob_snapshot/cluster_55:0.013549216818710677 - cluster/prob_snapshot/cluster_56:0.015149829209059096 - cluster/prob_snapshot/cluster_57:0.015149829209059096 - cluster/prob_snapshot/cluster_58:0.018653589492178574 - cluster/prob_snapshot/cluster_59:0.01653307448466884 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.015149829209059096 - cluster/prob_snapshot/cluster_62:0.017501346177595658 - cluster/prob_snapshot/cluster_63:0.013766583933449353
[36m(RewardLoopWorker pid=2826762)[0m WARNING:2026-04-12 19:06:17,830:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  29%|██▉       | 235/800 [7:35:58<18:35:43, 118.48s/it]
[36m(TaskRunner pid=2823680)[0m step:235 - global_seqlen/min:404331 - global_seqlen/max:463790 - global_seqlen/minmax_diff:59459 - global_seqlen/balanced_min:430195 - global_seqlen/balanced_max:430261 - global_seqlen/mean:430227.75 - frontier/skipped_zero_acc_count:35.0 - actor/entropy:np.float64(0.18501775733571738) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011858141981065273 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.027346306858817115) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006080428860552777) - actor/ppo_kl:np.float64(0.001580956090681363) - actor/pg_clipfrac_lower:np.float64(9.65057795915624e-06) - actor/grad_norm:np.float64(0.25706727678577107) - perf/mfu/actor:np.float64(0.227298124741035) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.2578010559082) - actor/lr:np.float64(1e-06) - training/global_step:235 - training/epoch:0 - critic/score/mean:0.602150559425354 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5974202156066895 - critic/rewards/max:1.098076581954956 - critic/rewards/min:-0.09340226650238037 - critic/advantages/mean:-0.19613049924373627 - critic/advantages/max:2.474740505218506 - critic/advantages/min:-2.4748361110687256 - critic/returns/mean:-0.19613049924373627 - critic/returns/max:2.474740505218506 - critic/returns/min:-2.4748361110687256 - response_length/mean:1386.997314453125 - response_length/max:8192.0 - response_length/min:91.0 - response_length/clip_ratio:0.037634409964084625 - response_length_non_aborted/mean:1386.997314453125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:91.0 - response_length_non_aborted/clip_ratio:0.037634409964084625 - response/aborted_ratio:0.0 - prompt_length/mean:236.93548583984375 - prompt_length/max:375.0 - prompt_length/min:186.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.811576455831528e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.8051033355295658) - timing_s/agent_loop/generate_sequences/max:np.float64(33.199679320678115) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.4510731274504) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(33.199679320678115) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:207 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:35.7203584080562 - timing_s/reward:0.0001509580761194229 - timing_s/old_log_prob:12.150730408728123 - timing_s/ref:24.597288442775607 - timing_s/adv:0.10558605473488569 - timing_s/update_actor:22.825170991942286 - timing_s/update_weights:30.822703883051872 - timing_s/step:126.6430721655488 - timing_s/stop_profile:6.51450827717781e-05 - timing_per_token_ms/adv:8.739077171846995e-05 - timing_per_token_ms/update_actor:0.01889178748652323 - timing_per_token_ms/gen:0.03461523249540781 - timing_per_token_ms/ref:0.020358522009306036 - perf/total_num_tokens:1720911 - perf/time_per_step:126.6430721655488 - perf/throughput:3397.167666918274 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:386.0 - frontier/mean_score:2.437592714285715 - frontier/mean_frontier_pct:0.1651310620366375 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.5509999999999997 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:2.1598999999999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.5540999999999996 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.51 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.91 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.8823509999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.6569999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.3 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:48.0 - frontier/cluster_10/score:3.8596999999999997 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:2.9831929999999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.8823509999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:2.7598999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:2.8823509999999994 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:2.0569999999999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.3 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:3.1889929999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.6569999999999996 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.5883509999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.9317299999999995 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:1.7398999999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.3629999999999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.2678699999999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:3.0538999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.9429999999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.6878699999999993 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.237 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.11 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.024291 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.6569999999999996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:3.0769999999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.8319299999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.7598999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.3 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.2600999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:3.2519299999999993 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.6569999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:2.51 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:2.09 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:3.0659 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.0569999999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:32.0 - frontier/cluster_58/score:2.8319299999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.51 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.3 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.6569999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:32.0 - frontier/cluster_63/score:1.763 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:235.0 - cluster/prob_snapshot/cluster_0:0.023123255593416867 - cluster/prob_snapshot/cluster_1:0.01406474788967082 - cluster/prob_snapshot/cluster_2:0.00970252065170125 - cluster/prob_snapshot/cluster_3:0.014977045301283807 - cluster/prob_snapshot/cluster_4:0.016631683219134333 - cluster/prob_snapshot/cluster_5:0.016344514654879284 - cluster/prob_snapshot/cluster_6:0.012437459358892206 - cluster/prob_snapshot/cluster_7:0.018769174565739424 - 
cluster/prob_snapshot/cluster_8:0.01730174320239612 - cluster/prob_snapshot/cluster_9:0.014977045301283807 - cluster/prob_snapshot/cluster_10:0.025133435543202222 - cluster/prob_snapshot/cluster_11:0.0179718031856579 - cluster/prob_snapshot/cluster_12:0.019425833349335973 - cluster/prob_snapshot/cluster_13:0.018769174565739424 - cluster/prob_snapshot/cluster_14:0.0179718031856579 - cluster/prob_snapshot/cluster_15:0.014977045301283807 - cluster/prob_snapshot/cluster_16:0.018769174565739424 - cluster/prob_snapshot/cluster_17:0.013394687906409037 - cluster/prob_snapshot/cluster_18:0.014977045301283807 - cluster/prob_snapshot/cluster_19:0.014977045301283807 - cluster/prob_snapshot/cluster_20:0.020765953315859544 - cluster/prob_snapshot/cluster_21:0.01730174320239612 - cluster/prob_snapshot/cluster_22:0.00970252065170125 - cluster/prob_snapshot/cluster_23:0.016854717470705754 - cluster/prob_snapshot/cluster_24:0.02560247753148547 - cluster/prob_snapshot/cluster_25:0.00970252065170125 - cluster/prob_snapshot/cluster_26:0.013023517653290267 - cluster/prob_snapshot/cluster_27:0.011329809182479866 - cluster/prob_snapshot/cluster_28:0.015387286107362449 - cluster/prob_snapshot/cluster_29:0.00970252065170125 - cluster/prob_snapshot/cluster_30:0.014767822490183696 - cluster/prob_snapshot/cluster_31:0.01988626028069157 - cluster/prob_snapshot/cluster_32:0.012652347400171494 - cluster/prob_snapshot/cluster_33:0.01750276119737465 - cluster/prob_snapshot/cluster_34:0.014566804495205164 - cluster/prob_snapshot/cluster_35:0.020251569950866365 - cluster/prob_snapshot/cluster_36:0.013181694786948304 - cluster/prob_snapshot/cluster_37:0.01730174320239612 - cluster/prob_snapshot/cluster_38:0.020036681909587073 - cluster/prob_snapshot/cluster_39:0.01844084517394115 - cluster/prob_snapshot/cluster_40:0.013023517653290267 - cluster/prob_snapshot/cluster_41:0.0179718031856579 - cluster/prob_snapshot/cluster_42:0.00970252065170125 - cluster/prob_snapshot/cluster_43:0.014977045301283807 - 
cluster/prob_snapshot/cluster_44:0.014717226124100664 - cluster/prob_snapshot/cluster_45:0.01360957594768833 - cluster/prob_snapshot/cluster_46:0.021175783881132105 - cluster/prob_snapshot/cluster_47:0.01730174320239612 - cluster/prob_snapshot/cluster_48:0.016344514654879284 - cluster/prob_snapshot/cluster_49:0.01730174320239612 - cluster/prob_snapshot/cluster_50:0.00970252065170125 - cluster/prob_snapshot/cluster_51:0.01360957594768833 - cluster/prob_snapshot/cluster_52:0.013023517653290267 - cluster/prob_snapshot/cluster_53:0.013023517653290267 - cluster/prob_snapshot/cluster_54:0.019964401386611316 - cluster/prob_snapshot/cluster_55:0.013394687906409037 - cluster/prob_snapshot/cluster_56:0.014977045301283807 - cluster/prob_snapshot/cluster_57:0.014977045301283807 - cluster/prob_snapshot/cluster_58:0.01844084517394115 - cluster/prob_snapshot/cluster_59:0.016344514654879284 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.014977045301283807 - cluster/prob_snapshot/cluster_62:0.01730174320239612 - cluster/prob_snapshot/cluster_63:0.01148023081137537
[36m(RewardLoopWorker pid=2826760)[0m WARNING:2026-04-12 19:08:21,492:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  30%|██▉       | 236/800 [7:37:47<18:05:42, 115.50s/it]
[36m(TaskRunner pid=2823680)[0m step:236 - global_seqlen/min:347651 - global_seqlen/max:468863 - global_seqlen/minmax_diff:121212 - global_seqlen/balanced_min:389380 - global_seqlen/balanced_max:389580 - global_seqlen/mean:389502.0 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.20112377575234228) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010018108412623405 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.03454871151188854) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004622445752750576) - actor/ppo_kl:np.float64(-0.0004026822027284375) - actor/pg_clipfrac_lower:np.float64(2.3547199757360985e-05) - actor/grad_norm:np.float64(0.27608299752076465) - perf/mfu/actor:np.float64(0.2343133158896285) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.60658264160156) - actor/lr:np.float64(1e-06) - training/global_step:236 - training/epoch:0 - critic/score/mean:0.6030219793319702 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5979341864585876 - critic/rewards/max:1.0477690696716309 - critic/rewards/min:-0.05648432672023773 - critic/advantages/mean:-0.11592979729175568 - critic/advantages/max:2.474771022796631 - critic/advantages/min:-2.4748268127441406 - critic/returns/mean:-0.11592979729175568 - critic/returns/max:2.474771022796631 - critic/returns/min:-2.4748268127441406 - response_length/mean:1173.6043701171875 - response_length/max:8192.0 - response_length/min:220.0 - response_length/clip_ratio:0.012362637557089329 - response_length_non_aborted/mean:1173.6043701171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:220.0 - response_length_non_aborted/clip_ratio:0.012362637557089329 - response/aborted_ratio:0.0 - prompt_length/mean:246.62637329101562 - prompt_length/max:461.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.423067629337311e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3230615155771375) - timing_s/agent_loop/generate_sequences/max:np.float64(29.339723331853747) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.899974244348414) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.339723331853747) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:205 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.642416610382497 - timing_s/reward:0.0001338636502623558 - timing_s/old_log_prob:9.751339736394584 - timing_s/ref:20.522487729787827 - timing_s/adv:0.0664522610604763 - timing_s/update_actor:19.529410884715617 - timing_s/update_weights:27.402247617952526 - timing_s/step:108.31083629094064 - timing_s/stop_profile:5.61252236366272e-05 - timing_per_token_ms/adv:6.42716524366071e-05 - timing_per_token_ms/update_actor:0.018888559826908274 - timing_per_token_ms/gen:0.03586492327850533 - timing_per_token_ms/ref:0.019849049188906603 - perf/total_num_tokens:1558008 - perf/time_per_step:108.31083629094064 - perf/throughput:3596.149871410224 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:423.0 - frontier/mean_score:2.431028452380953 - frontier/mean_frontier_pct:0.1783278740968595 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.5509999999999997 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:2.1598999999999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.5540999999999996 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.51 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.91 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:2.9176456999999996 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.7598999999999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.91 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:48.0 - frontier/cluster_10/score:3.8596999999999997 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.9882350999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.8823509999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:3.4319299999999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:2.9176456999999996 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:2.339899999999999 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.3 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:3.1889929999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.6569999999999996 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.5883509999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.9317299999999995 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:1.7398999999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.3629999999999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.2678699999999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:3.0538999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:1.9429999999999998 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - 
frontier/cluster_33/score:2.1815089999999993 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.237 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.0769999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.024291 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.6569999999999996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:3.0769999999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.8319299999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.8319299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.51 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.2600999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:3.2519299999999993 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.0569999999999995 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:2.09 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:3.0659 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.0569999999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.2823509999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.6569999999999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.3 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.6569999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:32.0 - frontier/cluster_63/score:1.763 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:236.0 - cluster/prob_snapshot/cluster_0:0.023185692997494668 - cluster/prob_snapshot/cluster_1:0.014102725515429097 - cluster/prob_snapshot/cluster_2:0.009728719393485514 - cluster/prob_snapshot/cluster_3:0.015017486312091731 - cluster/prob_snapshot/cluster_4:0.016676592082484124 - cluster/prob_snapshot/cluster_5:0.016388648105804454 - 
cluster/prob_snapshot/cluster_6:0.012471042980910958 - cluster/prob_snapshot/cluster_7:0.01905030624490578 - cluster/prob_snapshot/cluster_8:0.01802033064032259 - cluster/prob_snapshot/cluster_9:0.012471042980910958 - cluster/prob_snapshot/cluster_10:0.02520130083425237 - cluster/prob_snapshot/cluster_11:0.01802033064032259 - cluster/prob_snapshot/cluster_12:0.01951120857024437 - cluster/prob_snapshot/cluster_13:0.01881985508223648 - cluster/prob_snapshot/cluster_14:0.02240824426045955 - cluster/prob_snapshot/cluster_15:0.015017486312091731 - cluster/prob_snapshot/cluster_16:0.01905030624490578 - cluster/prob_snapshot/cluster_17:0.015278007052897143 - cluster/prob_snapshot/cluster_18:0.015017486312091731 - cluster/prob_snapshot/cluster_19:0.015017486312091731 - cluster/prob_snapshot/cluster_20:0.0208220255334158 - cluster/prob_snapshot/cluster_21:0.017348461361403358 - cluster/prob_snapshot/cluster_22:0.009728719393485514 - cluster/prob_snapshot/cluster_23:0.016900228571038668 - cluster/prob_snapshot/cluster_24:0.02567160932949583 - cluster/prob_snapshot/cluster_25:0.008768906137886607 - cluster/prob_snapshot/cluster_26:0.013058683749644985 - cluster/prob_snapshot/cluster_27:0.01136040192800365 - cluster/prob_snapshot/cluster_28:0.015428834850205546 - cluster/prob_snapshot/cluster_29:0.009728719393485514 - cluster/prob_snapshot/cluster_30:0.014807698557653681 - cluster/prob_snapshot/cluster_31:0.019939957151520405 - cluster/prob_snapshot/cluster_32:0.0126865112627801 - cluster/prob_snapshot/cluster_33:0.014243818064002135 - cluster/prob_snapshot/cluster_34:0.014606137773977915 - cluster/prob_snapshot/cluster_35:0.020090784948828804 - cluster/prob_snapshot/cluster_36:0.013217287993126296 - cluster/prob_snapshot/cluster_37:0.017348461361403358 - cluster/prob_snapshot/cluster_38:0.020090784948828804 - cluster/prob_snapshot/cluster_39:0.018490639135566055 - cluster/prob_snapshot/cluster_40:0.013058683749644985 - cluster/prob_snapshot/cluster_41:0.018490639135566055 - 
cluster/prob_snapshot/cluster_42:0.009728719393485514 - cluster/prob_snapshot/cluster_43:0.016388648105804454 - cluster/prob_snapshot/cluster_44:0.01475696557128631 - cluster/prob_snapshot/cluster_45:0.013646324518379008 - cluster/prob_snapshot/cluster_46:0.0212329627229915 - cluster/prob_snapshot/cluster_47:0.01802033064032259 - cluster/prob_snapshot/cluster_48:0.013430856236509863 - cluster/prob_snapshot/cluster_49:0.017348461361403358 - cluster/prob_snapshot/cluster_50:0.009728719393485514 - cluster/prob_snapshot/cluster_51:0.013646324518379008 - cluster/prob_snapshot/cluster_52:0.013058683749644985 - cluster/prob_snapshot/cluster_53:0.013058683749644985 - cluster/prob_snapshot/cluster_54:0.02001830925401828 - cluster/prob_snapshot/cluster_55:0.013430856236509863 - cluster/prob_snapshot/cluster_56:0.015017486312091731 - cluster/prob_snapshot/cluster_57:0.015017486312091731 - cluster/prob_snapshot/cluster_58:0.014902249957342985 - cluster/prob_snapshot/cluster_59:0.017348461361403358 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.015017486312091731 - cluster/prob_snapshot/cluster_62:0.017348461361403358 - cluster/prob_snapshot/cluster_63:0.011511229725312053
[36m(TaskRunner pid=2823680)[0m Training Progress:  30%|██▉       | 237/800 [7:39:25<17:16:12, 110.43s/it]
[36m(TaskRunner pid=2823680)[0m step:237 - global_seqlen/min:326651 - global_seqlen/max:361802 - global_seqlen/minmax_diff:35151 - global_seqlen/balanced_min:346011 - global_seqlen/balanced_max:347397 - global_seqlen/mean:346642.5 - frontier/skipped_zero_acc_count:21.0 - actor/entropy:np.float64(0.20028003763959365) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013686033897101879 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.10349482391029596) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0003667396442863349) - actor/ppo_kl:np.float64(6.13161437938827e-05) - actor/pg_clipfrac_lower:np.float64(0.0) - actor/grad_norm:np.float64(0.29129686525889803) - perf/mfu/actor:np.float64(0.15394142297130178) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.93149948120117) - actor/lr:np.float64(1e-06) - training/global_step:237 - training/epoch:0 - critic/score/mean:0.65887850522995 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.649907648563385 - critic/rewards/max:1.0353269577026367 - critic/rewards/min:-0.04049622267484665 - critic/advantages/mean:-0.12880945205688477 - critic/advantages/max:2.4747073650360107 - critic/advantages/min:-2.4748523235321045 - critic/returns/mean:-0.12880945205688477 - critic/returns/max:2.4747073650360107 - critic/returns/min:-2.4748523235321045 - response_length/mean:1021.1436767578125 - response_length/max:8192.0 - response_length/min:169.0 - response_length/clip_ratio:0.011682243086397648 - response_length_non_aborted/mean:1021.1436767578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:169.0 - response_length_non_aborted/clip_ratio:0.011682243086397648 - response/aborted_ratio:0.0 - prompt_length/mean:238.74766540527344 - prompt_length/max:395.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.641183376312256e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7102813897654414) - timing_s/agent_loop/generate_sequences/max:np.float64(27.615057598799467) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.405673438347549) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(27.615057598799467) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:190 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:29.137018512934446 - timing_s/reward:0.00029452983289957047 - timing_s/old_log_prob:10.434901249594986 - timing_s/ref:10.653739494271576 - timing_s/adv:0.09948915336281061 - timing_s/update_actor:27.004689888097346 - timing_s/update_weights:20.623647809028625 - timing_s/step:98.3791718846187 - timing_s/stop_profile:6.627105176448822e-05 - timing_per_token_ms/adv:9.225053095070188e-05 - timing_per_token_ms/update_actor:0.025039885214936893 - timing_per_token_ms/gen:0.03333377399234463 - timing_per_token_ms/ref:0.009878595723625828 - perf/total_num_tokens:1386570 - perf/time_per_step:98.3791718846187 - perf/throughput:3523.535453282226 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:444.0 - frontier/mean_score:2.4879290442857145 - frontier/mean_frontier_pct:0.19796031444980977 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:0.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.9856999999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:2.4119299999999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.51 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.5540999999999996 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.6569999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:1.91 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9423519899999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.8319299999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.91 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.201789999999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.9882350999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.8823509999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:3.4319299999999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:2.9176456999999996 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:2.339899999999999 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.3 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:3.1889929999999995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.6569999999999996 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.5883509999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.9317299999999995 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.1179299999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.3629999999999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.2678699999999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:3.0377299999999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.2600999999999996 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - 
frontier/cluster_33/score:2.1815089999999993 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.237 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.0769999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.024291 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:2.7598999999999996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:3.0769999999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.8319299999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.8319299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.51 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.2600999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:3.7763509999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.0569999999999995 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.6569999999999996 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:2.09 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:3.0659 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.0569999999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.2823509999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:32.0 - frontier/cluster_59/score:2.7598999999999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.51 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.6569999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:32.0 - frontier/cluster_63/score:1.763 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:237.0 - cluster/prob_snapshot/cluster_0:0.025428811770330368 - cluster/prob_snapshot/cluster_1:0.015388141097727607 - cluster/prob_snapshot/cluster_2:0.009506217110618524 - cluster/prob_snapshot/cluster_3:0.016013828823927847 - cluster/prob_snapshot/cluster_4:0.01629518733035622 - cluster/prob_snapshot/cluster_5:0.016951690512022426 - 
cluster/prob_snapshot/cluster_6:0.012185821933745893 - cluster/prob_snapshot/cluster_7:0.018772239485100983 - cluster/prob_snapshot/cluster_8:0.018067745920854975 - cluster/prob_snapshot/cluster_9:0.012185821933745893 - cluster/prob_snapshot/cluster_10:0.026807468451829398 - cluster/prob_snapshot/cluster_11:0.017608193693688632 - cluster/prob_snapshot/cluster_12:0.01906497425380594 - cluster/prob_snapshot/cluster_13:0.018389432479871414 - cluster/prob_snapshot/cluster_14:0.021895752811036933 - cluster/prob_snapshot/cluster_15:0.014674026412364163 - cluster/prob_snapshot/cluster_16:0.018614613071182926 - cluster/prob_snapshot/cluster_17:0.01492858887056126 - cluster/prob_snapshot/cluster_18:0.014674026412364163 - cluster/prob_snapshot/cluster_19:0.016013828823927847 - cluster/prob_snapshot/cluster_20:0.02034581196123671 - cluster/prob_snapshot/cluster_21:0.016951690512022426 - cluster/prob_snapshot/cluster_22:0.009506217110618524 - cluster/prob_snapshot/cluster_23:0.016513709103682255 - cluster/prob_snapshot/cluster_24:0.0250844825505585 - cluster/prob_snapshot/cluster_25:0.008568355422523944 - cluster/prob_snapshot/cluster_26:0.012760022967273188 - cluster/prob_snapshot/cluster_27:0.013512417721538448 - cluster/prob_snapshot/cluster_28:0.015075967135833267 - cluster/prob_snapshot/cluster_29:0.009506217110618524 - cluster/prob_snapshot/cluster_30:0.014469036643394918 - cluster/prob_snapshot/cluster_31:0.019380752284187387 - cluster/prob_snapshot/cluster_32:0.014419463954167062 - cluster/prob_snapshot/cluster_33:0.013918052471656577 - cluster/prob_snapshot/cluster_34:0.014272085688895061 - cluster/prob_snapshot/cluster_35:0.019631295335149797 - cluster/prob_snapshot/cluster_36:0.012914999826222203 - cluster/prob_snapshot/cluster_37:0.017608193693688632 - cluster/prob_snapshot/cluster_38:0.019631295335149797 - cluster/prob_snapshot/cluster_39:0.018067745920854975 - cluster/prob_snapshot/cluster_40:0.012760022967273188 - 
cluster/prob_snapshot/cluster_41:0.018067745920854975 - cluster/prob_snapshot/cluster_42:0.009506217110618524 - cluster/prob_snapshot/cluster_43:0.016013828823927847 - cluster/prob_snapshot/cluster_44:0.014419463954167062 - cluster/prob_snapshot/cluster_45:0.015075967135833267 - cluster/prob_snapshot/cluster_46:0.02409316274624253 - cluster/prob_snapshot/cluster_47:0.017608193693688632 - cluster/prob_snapshot/cluster_48:0.01312368362184047 - cluster/prob_snapshot/cluster_49:0.016951690512022426 - cluster/prob_snapshot/cluster_50:0.009506217110618524 - cluster/prob_snapshot/cluster_51:0.01333422400080048 - cluster/prob_snapshot/cluster_52:0.012760022967273188 - cluster/prob_snapshot/cluster_53:0.012760022967273188 - cluster/prob_snapshot/cluster_54:0.019560477207681434 - cluster/prob_snapshot/cluster_55:0.01312368362184047 - cluster/prob_snapshot/cluster_56:0.014674026412364163 - cluster/prob_snapshot/cluster_57:0.014674026412364163 - cluster/prob_snapshot/cluster_58:0.014561425589689458 - cluster/prob_snapshot/cluster_59:0.017608193693688632 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.016013828823927847 - cluster/prob_snapshot/cluster_62:0.016951690512022426 - cluster/prob_snapshot/cluster_63:0.011247960245651313
[36m(TaskRunner pid=2823680)[0m Training Progress:  30%|██▉       | 238/800 [7:41:24<17:36:15, 112.77s/it]
[36m(TaskRunner pid=2823680)[0m step:238 - global_seqlen/min:325171 - global_seqlen/max:490727 - global_seqlen/minmax_diff:165556 - global_seqlen/balanced_min:400149 - global_seqlen/balanced_max:400263 - global_seqlen/mean:400197.5 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.21432591305703533) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01202037651091814 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0448080295973341) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005516330063389377) - actor/ppo_kl:np.float64(-0.00014639849322687653) - actor/pg_clipfrac_lower:np.float64(2.38498769435029e-05) - actor/grad_norm:np.float64(0.22261243829360375) - perf/mfu/actor:np.float64(0.19874236961282282) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.32020568847656) - actor/lr:np.float64(1e-06) - training/global_step:238 - training/epoch:0 - critic/score/mean:0.6198979616165161 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6105339527130127 - critic/rewards/max:1.0231890678405762 - critic/rewards/min:-0.11002514511346817 - critic/advantages/mean:-0.15458367764949799 - critic/advantages/max:2.474777936935425 - critic/advantages/min:-2.474818468093872 - critic/returns/mean:-0.15458367764949799 - critic/returns/max:2.474777936935425 - critic/returns/min:-2.474818468093872 - response_length/mean:1223.81884765625 - response_length/max:8192.0 - response_length/min:142.0 - response_length/clip_ratio:0.024234693497419357 - response_length_non_aborted/mean:1223.81884765625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:142.0 - response_length_non_aborted/clip_ratio:0.024234693497419357 - response/aborted_ratio:0.0 - prompt_length/mean:241.3163299560547 - prompt_length/max:415.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010386388748884201 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1168629471212626) - timing_s/agent_loop/generate_sequences/max:np.float64(29.59211471490562) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.883526482885827) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.59211471490562) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:216 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.224509458988905 - timing_s/reward:0.00019337888807058334 - timing_s/old_log_prob:11.599094060249627 - timing_s/ref:21.958722826093435 - timing_s/adv:0.1053131790831685 - timing_s/update_actor:23.838549599051476 - timing_s/update_weights:28.80178425181657 - timing_s/step:117.99501678068191 - timing_s/stop_profile:5.808938294649124e-05 - timing_per_token_ms/adv:9.168302977816746e-05 - timing_per_token_ms/update_actor:0.02075324733129689 - timing_per_token_ms/gen:0.032543361736731696 - timing_per_token_ms/ref:0.019116716979603675 - perf/total_num_tokens:1600790 - perf/time_per_step:117.99501678068191 - perf/throughput:3391.647468840567 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:474.0 - frontier/mean_score:2.506195621904762 - frontier/mean_frontier_pct:0.21773445327790572 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:14.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.6899899999999994 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:2.4119299999999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.51 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:2.6878699999999993 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.6569999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:2.237 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9423519899999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.8823509999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.91 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.441253 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.9882350999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.8823509999999994 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:3.3023509999999994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.51 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.9423519899999997 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:2.339899999999999 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.51 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.5322950999999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.6569999999999996 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.5883509999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.9317299999999995 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.3825509999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.3629999999999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.2678699999999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:3.0377299999999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.4820699999999993 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.1815089999999993 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.237 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.0769999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.024291 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:2.7598999999999996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:3.0769999999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.8319299999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.8319299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.51 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.2600999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:3.7763509999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.8319299999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.0569999999999995 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.7598999999999996 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - 
frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:2.3629999999999995 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:3.0659 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.0569999999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.2823509999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:32.0 - frontier/cluster_59/score:2.7598999999999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.51 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:2.7598999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:32.0 - frontier/cluster_63/score:1.763 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:238.0 - cluster/prob_snapshot/cluster_0:0.02337058980126796 - cluster/prob_snapshot/cluster_1:0.015275983582441205 - cluster/prob_snapshot/cluster_2:0.009436930399239365 - cluster/prob_snapshot/cluster_3:0.015897110941000537 - cluster/prob_snapshot/cluster_4:0.017023652424297653 - cluster/prob_snapshot/cluster_5:0.016828136960254352 - cluster/prob_snapshot/cluster_6:0.014168062619529168 - 
cluster/prob_snapshot/cluster_7:0.018635416738049287 - cluster/prob_snapshot/cluster_8:0.018255399847770453 - cluster/prob_snapshot/cluster_9:0.012097004739964555 - cluster/prob_snapshot/cluster_10:0.02812872177611612 - cluster/prob_snapshot/cluster_11:0.017479855173732024 - cluster/prob_snapshot/cluster_12:0.018926017889438977 - cluster/prob_snapshot/cluster_13:0.018255399847770453 - cluster/prob_snapshot/cluster_14:0.020915474188495643 - cluster/prob_snapshot/cluster_15:0.015897110941000537 - cluster/prob_snapshot/cluster_16:0.018635416738049287 - cluster/prob_snapshot/cluster_17:0.014819780833006833 - cluster/prob_snapshot/cluster_18:0.015897110941000537 - cluster/prob_snapshot/cluster_19:0.015897110941000537 - cluster/prob_snapshot/cluster_20:0.016038317187271733 - cluster/prob_snapshot/cluster_21:0.016828136960254352 - cluster/prob_snapshot/cluster_22:0.009436930399239365 - cluster/prob_snapshot/cluster_23:0.01639334780926282 - cluster/prob_snapshot/cluster_24:0.024901652589665358 - cluster/prob_snapshot/cluster_25:0.008505904379985548 - cluster/prob_snapshot/cluster_26:0.012667020670119952 - cluster/prob_snapshot/cluster_27:0.015089911382307477 - cluster/prob_snapshot/cluster_28:0.01496608492174672 - cluster/prob_snapshot/cluster_29:0.009436930399239365 - cluster/prob_snapshot/cluster_30:0.014363578083572463 - cluster/prob_snapshot/cluster_31:0.019239494350121737 - cluster/prob_snapshot/cluster_32:0.015720215997342312 - cluster/prob_snapshot/cluster_33:0.01381660979752635 - cluster/prob_snapshot/cluster_34:0.014168062619529168 - cluster/prob_snapshot/cluster_35:0.01948821130097954 - cluster/prob_snapshot/cluster_36:0.012820867969668893 - cluster/prob_snapshot/cluster_37:0.017479855173732024 - cluster/prob_snapshot/cluster_38:0.01948821130097954 - cluster/prob_snapshot/cluster_39:0.017936057923166396 - cluster/prob_snapshot/cluster_40:0.012667020670119952 - cluster/prob_snapshot/cluster_41:0.017936057923166396 - 
cluster/prob_snapshot/cluster_42:0.009436930399239365 - cluster/prob_snapshot/cluster_43:0.015897110941000537 - cluster/prob_snapshot/cluster_44:0.01431436670826905 - cluster/prob_snapshot/cluster_45:0.01496608492174672 - cluster/prob_snapshot/cluster_46:0.023917558087314075 - cluster/prob_snapshot/cluster_47:0.017936057923166396 - cluster/prob_snapshot/cluster_48:0.013028030759218368 - cluster/prob_snapshot/cluster_49:0.017479855173732024 - cluster/prob_snapshot/cluster_50:0.009436930399239365 - cluster/prob_snapshot/cluster_51:0.01496608492174672 - cluster/prob_snapshot/cluster_52:0.012667020670119952 - cluster/prob_snapshot/cluster_53:0.012667020670119952 - cluster/prob_snapshot/cluster_54:0.01941790933626038 - cluster/prob_snapshot/cluster_55:0.013028030759218368 - cluster/prob_snapshot/cluster_56:0.014567073770637944 - cluster/prob_snapshot/cluster_57:0.014567073770637944 - cluster/prob_snapshot/cluster_58:0.014455293646734467 - cluster/prob_snapshot/cluster_59:0.017479855173732024 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.015897110941000537 - cluster/prob_snapshot/cluster_62:0.017479855173732024 - cluster/prob_snapshot/cluster_63:0.011165978720710738
(TaskRunner pid=2823680) Training Progress:  30%|██▉       | 239/800 [7:43:20<17:45:42, 113.98s/it]
[36m(TaskRunner pid=2823680)[0m step:239 - global_seqlen/min:393560 - global_seqlen/max:446351 - global_seqlen/minmax_diff:52791 - global_seqlen/balanced_min:417425 - global_seqlen/balanced_max:417481 - global_seqlen/mean:417447.75 - frontier/skipped_zero_acc_count:21.0 - actor/entropy:np.float64(0.19067844734699638) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010769336484372616 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.08702440926572308) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005798676155016572) - actor/ppo_kl:np.float64(-0.000387829285110094) - actor/pg_clipfrac_lower:np.float64(2.9846457874993104e-05) - actor/grad_norm:np.float64(0.2386909191097532) - perf/mfu/actor:np.float64(0.15112120251814623) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.90680694580078) - actor/lr:np.float64(1e-06) - training/global_step:239 - training/epoch:0 - critic/score/mean:0.639018714427948 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6332564353942871 - critic/rewards/max:1.2465013265609741 - critic/rewards/min:-0.07689154148101807 - critic/advantages/mean:-0.23072807490825653 - critic/advantages/max:2.4746899604797363 - critic/advantages/min:-2.47483229637146 - critic/returns/mean:-0.23072807490825653 - critic/returns/max:2.4746899604797363 - critic/returns/min:-2.47483229637146 - response_length/mean:1335.8983154296875 - response_length/max:8192.0 - response_length/min:222.0 - response_length/clip_ratio:0.029205607250332832 - response_length_non_aborted/mean:1335.8983154296875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:222.0 - response_length_non_aborted/clip_ratio:0.029205607250332832 - response/aborted_ratio:0.0 - prompt_length/mean:235.45794677734375 - prompt_length/max:497.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.78279858827591e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.796583984978497) - timing_s/agent_loop/generate_sequences/max:np.float64(30.373468874022365) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.359424914530791) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.373468874022365) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:198 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.30759506020695 - timing_s/reward:0.00018762331455945969 - timing_s/old_log_prob:12.530815217643976 - timing_s/ref:14.062341643497348 - timing_s/adv:0.09854169748723507 - timing_s/update_actor:32.54843518324196 - timing_s/update_weights:24.55734878592193 - timing_s/step:116.60360673815012 - timing_s/stop_profile:5.315057933330536e-05 - timing_per_token_ms/adv:7.326079060460677e-05 - timing_per_token_ms/update_actor:0.0241981227771725 - timing_per_token_ms/gen:0.028252536717658188 - timing_per_token_ms/ref:0.010454642986925953 - perf/total_num_tokens:1669791 - perf/time_per_step:116.60360673815012 - perf/throughput:3580.058642074751 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:495.0 - frontier/mean_score:2.5621465880952377 - frontier/mean_frontier_pct:0.23302230397150697 - frontier/batch_easy_count:5.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.6899899999999994 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:2.4119299999999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.51 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:2.6878699999999993 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.6569999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:2.237 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9423519899999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.8823509999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.91 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.441253 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.9917645699999995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.9176456999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:3.8116456999999992 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.2569999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.9423519899999997 
- frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:2.339899999999999 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.51 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.5322950999999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:3.3598999999999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.5883509999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.6522109999999994 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.3825509999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.3629999999999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.2678699999999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:3.0377299999999994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.4820699999999993 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - 
frontier/cluster_33/score:2.1815089999999993 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.4659 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:3.0769999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.024291 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:2.7598999999999996 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:3.0769999999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.8823509999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.8319299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.49 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.51 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.2600999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:3.7763509999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.8319299999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.339899999999999 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.8319299999999994 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:2.3629999999999995 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.3 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:3.0659 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.0569999999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:1.91 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.2823509999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:3.4319299999999995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.6569999999999996 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:3.4319299999999995 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.5340999999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:239.0 - cluster/prob_snapshot/cluster_0:0.022860233724883456 - cluster/prob_snapshot/cluster_1:0.014942393753928371 - cluster/prob_snapshot/cluster_2:0.00923085109988817 - cluster/prob_snapshot/cluster_3:0.015549957221959267 - cluster/prob_snapshot/cluster_4:0.016651897816011015 - 
cluster/prob_snapshot/cluster_5:0.01646065192778716 - cluster/prob_snapshot/cluster_6:0.013858667053993181 - cluster/prob_snapshot/cluster_7:0.01822846516989909 - cluster/prob_snapshot/cluster_8:0.01785674691182132 - cluster/prob_snapshot/cluster_9:0.011832835973682152 - cluster/prob_snapshot/cluster_10:0.027514459825457475 - cluster/prob_snapshot/cluster_11:0.017098138221866684 - cluster/prob_snapshot/cluster_12:0.01853458608831608 - cluster/prob_snapshot/cluster_13:0.018075404710690596 - cluster/prob_snapshot/cluster_14:0.023613915370623496 - cluster/prob_snapshot/cluster_15:0.020177773176064277 - cluster/prob_snapshot/cluster_16:0.01822846516989909 - cluster/prob_snapshot/cluster_17:0.0144961533480727 - cluster/prob_snapshot/cluster_18:0.015549957221959267 - cluster/prob_snapshot/cluster_19:0.015549957221959267 - cluster/prob_snapshot/cluster_20:0.01568807987186337 - cluster/prob_snapshot/cluster_21:0.0208152594701438 - cluster/prob_snapshot/cluster_22:0.00923085109988817 - cluster/prob_snapshot/cluster_23:0.01603535750016553 - cluster/prob_snapshot/cluster_24:0.022626185185485686 - cluster/prob_snapshot/cluster_25:0.008320156394060278 - cluster/prob_snapshot/cluster_26:0.01239040416092372 - cluster/prob_snapshot/cluster_27:0.01476038491200648 - cluster/prob_snapshot/cluster_28:0.014639262516131372 - cluster/prob_snapshot/cluster_29:0.00923085109988817 - cluster/prob_snapshot/cluster_30:0.014049912942217035 - cluster/prob_snapshot/cluster_31:0.0188193512158814 - cluster/prob_snapshot/cluster_32:0.015376925227851964 - cluster/prob_snapshot/cluster_33:0.013514889095346268 - cluster/prob_snapshot/cluster_34:0.0152767488102109 - cluster/prob_snapshot/cluster_35:0.01906263680158114 - cluster/prob_snapshot/cluster_36:0.012540891814660218 - cluster/prob_snapshot/cluster_37:0.017098138221866684 - cluster/prob_snapshot/cluster_38:0.01906263680158114 - cluster/prob_snapshot/cluster_39:0.01785674691182132 - cluster/prob_snapshot/cluster_40:0.01239040416092372 - 
cluster/prob_snapshot/cluster_41:0.017544378627722353 - cluster/prob_snapshot/cluster_42:0.00923085109988817 - cluster/prob_snapshot/cluster_43:0.015549957221959267 - cluster/prob_snapshot/cluster_44:0.014001776222051846 - cluster/prob_snapshot/cluster_45:0.014639262516131372 - cluster/prob_snapshot/cluster_46:0.023395257571754223 - cluster/prob_snapshot/cluster_47:0.017544378627722353 - cluster/prob_snapshot/cluster_48:0.0144961533480727 - cluster/prob_snapshot/cluster_49:0.017544378627722353 - cluster/prob_snapshot/cluster_50:0.00923085109988817 - cluster/prob_snapshot/cluster_51:0.014639262516131372 - cluster/prob_snapshot/cluster_52:0.014248964785062277 - cluster/prob_snapshot/cluster_53:0.01239040416092372 - cluster/prob_snapshot/cluster_54:0.018993870058488016 - cluster/prob_snapshot/cluster_55:0.012743530679510043 - cluster/prob_snapshot/cluster_56:0.011832835973682152 - cluster/prob_snapshot/cluster_57:0.014248964785062277 - cluster/prob_snapshot/cluster_58:0.014139625663544202 - cluster/prob_snapshot/cluster_59:0.021261499875999467 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.01646065192778716 - cluster/prob_snapshot/cluster_62:0.021261499875999467 - cluster/prob_snapshot/cluster_63:0.009504059511636537
(TaskRunner pid=2823680) Training Progress:  30%|███       | 240/800 [7:45:17<17:51:09, 114.77s/it]
[36m(TaskRunner pid=2823680)[0m step:240 - global_seqlen/min:388498 - global_seqlen/max:456784 - global_seqlen/minmax_diff:68286 - global_seqlen/balanced_min:417184 - global_seqlen/balanced_max:417317 - global_seqlen/mean:417235.0 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.1917250319859203) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012163580395281315 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.010598858225421282) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004658585427505943) - actor/ppo_kl:np.float64(5.7582294607353716e-05) - actor/pg_clipfrac_lower:np.float64(3.1150781551889164e-06) - actor/grad_norm:np.float64(0.25380782783031464) - perf/mfu/actor:np.float64(0.22696216137152012) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.72962951660156) - actor/lr:np.float64(1e-06) - training/global_step:240 - training/epoch:0 - critic/score/mean:0.6237244606018066 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6163972616195679 - critic/rewards/max:1.083272099494934 - critic/rewards/min:-0.06592047214508057 - critic/advantages/mean:-0.14621099829673767 - critic/advantages/max:2.474609851837158 - critic/advantages/min:-2.474837064743042 - critic/returns/mean:-0.14621099829673767 - critic/returns/max:2.474609851837158 - critic/returns/min:-2.474837064743042 - response_length/mean:1262.0982666015625 - response_length/max:8192.0 - response_length/min:146.0 - response_length/clip_ratio:0.01785714365541935 - response_length_non_aborted/mean:1262.0982666015625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:146.0 - response_length_non_aborted/clip_ratio:0.01785714365541935 - response/aborted_ratio:0.0 - prompt_length/mean:241.03060913085938 - prompt_length/max:497.0 - prompt_length/min:187.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.613949805498123e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.628661872819066) - timing_s/agent_loop/generate_sequences/max:np.float64(30.17763653025031) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.685419104066568) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.17763653025031) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:215 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.37536397855729 - timing_s/reward:0.00020273122936487198 - timing_s/old_log_prob:10.407373942434788 - timing_s/ref:21.857347054407 - timing_s/adv:0.09866012632846832 - timing_s/update_actor:22.54143268428743 - timing_s/update_weights:28.73123622406274 - timing_s/step:116.40124867856503 - timing_s/stop_profile:6.026867777109146e-05 - timing_per_token_ms/adv:8.372003493433198e-05 - timing_per_token_ms/update_actor:0.019127986168551 - timing_per_token_ms/gen:0.032719408559561076 - timing_per_token_ms/ref:0.01854749154561701 - perf/total_num_tokens:1668940 - perf/time_per_step:116.40124867856503 - perf/throughput:3584.454674985224 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:525.0 - frontier/mean_score:2.5963952468095233 - frontier/mean_frontier_pct:0.24540891427946637 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:3.6899899999999994 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:3.1883509999999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.51 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:2.7815089999999993 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.6569999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:2.237 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9423519899999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.8823509999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.91 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.441253 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:3.5942351989999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.9176456999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:3.8116456999999992 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.2569999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.9423519899999997 
- frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:2.339899999999999 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.51 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.6726065699999992 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:3.3598999999999997 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:1.343 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.5883509999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.4565476999999993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.3825509999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.3629999999999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.4875089999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:3.0264109999999995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.4820699999999993 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - 
frontier/cluster_33/score:2.1815089999999993 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.4659 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:3.6538999999999997 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.024291 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:2.8319299999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:3.0769999999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.8319299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.9429999999999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.51 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.2600999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:3.7763509999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.8319299999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.339899999999999 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:2.8319299999999994 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:2.5540999999999996 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.3 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:3.0659 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.0569999999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:1.91 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:1.91 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.2823509999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:3.3023509999999994 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.6569999999999996 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:3.3023509999999994 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.5340999999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:240.0 - cluster/prob_snapshot/cluster_0:0.022558687824299015 - cluster/prob_snapshot/cluster_1:0.01949192677576134 - cluster/prob_snapshot/cluster_2:0.009109088333086414 - cluster/prob_snapshot/cluster_3:0.015344840077883825 - cluster/prob_snapshot/cluster_4:0.01700470549011735 - 
cluster/prob_snapshot/cluster_5:0.01624352194698698 - cluster/prob_snapshot/cluster_6:0.013675859463835108 - cluster/prob_snapshot/cluster_7:0.017988016230833955 - cluster/prob_snapshot/cluster_8:0.017621201252322118 - cluster/prob_snapshot/cluster_9:0.01167675081623829 - cluster/prob_snapshot/cluster_10:0.027151520729251703 - cluster/prob_snapshot/cluster_11:0.01687259925535919 - cluster/prob_snapshot/cluster_12:0.02197329256213384 - cluster/prob_snapshot/cluster_13:0.017836974769093787 - cluster/prob_snapshot/cluster_14:0.023302427768945633 - cluster/prob_snapshot/cluster_15:0.019911611208632517 - cluster/prob_snapshot/cluster_16:0.017988016230833955 - cluster/prob_snapshot/cluster_17:0.014304936772207311 - cluster/prob_snapshot/cluster_18:0.015344840077883825 - cluster/prob_snapshot/cluster_19:0.015344840077883825 - cluster/prob_snapshot/cluster_20:0.016338932433367176 - cluster/prob_snapshot/cluster_21:0.020540688517004725 - cluster/prob_snapshot/cluster_22:0.008210406463983258 - cluster/prob_snapshot/cluster_23:0.015823837514115806 - cluster/prob_snapshot/cluster_24:0.021131542501225956 - cluster/prob_snapshot/cluster_25:0.008210406463983258 - cluster/prob_snapshot/cluster_26:0.01222696420548512 - cluster/prob_snapshot/cluster_27:0.014565682897371385 - cluster/prob_snapshot/cluster_28:0.014446158208780667 - cluster/prob_snapshot/cluster_29:0.009109088333086414 - cluster/prob_snapshot/cluster_30:0.01520734175191104 - cluster/prob_snapshot/cluster_31:0.018501909484043212 - cluster/prob_snapshot/cluster_32:0.015174090522754223 - cluster/prob_snapshot/cluster_33:0.013336616228471815 - cluster/prob_snapshot/cluster_34:0.015075235517152879 - cluster/prob_snapshot/cluster_35:0.02233805225521104 - cluster/prob_snapshot/cluster_36:0.012375466799242838 - cluster/prob_snapshot/cluster_37:0.017312953371219734 - cluster/prob_snapshot/cluster_38:0.018811184430138853 - cluster/prob_snapshot/cluster_39:0.017836974769093787 - 
cluster/prob_snapshot/cluster_40:0.01222696420548512 - cluster/prob_snapshot/cluster_41:0.017312953371219734 - cluster/prob_snapshot/cluster_42:0.011878495725628794 - cluster/prob_snapshot/cluster_43:0.015344840077883825 - cluster/prob_snapshot/cluster_44:0.013817080900408457 - cluster/prob_snapshot/cluster_45:0.014446158208780667 - cluster/prob_snapshot/cluster_46:0.023086654252173967 - cluster/prob_snapshot/cluster_47:0.017312953371219734 - cluster/prob_snapshot/cluster_48:0.014304936772207311 - cluster/prob_snapshot/cluster_49:0.017312953371219734 - cluster/prob_snapshot/cluster_50:0.009109088333086414 - cluster/prob_snapshot/cluster_51:0.01561444463861477 - cluster/prob_snapshot/cluster_52:0.014061008836307887 - cluster/prob_snapshot/cluster_53:0.01222696420548512 - cluster/prob_snapshot/cluster_54:0.018743324778798417 - cluster/prob_snapshot/cluster_55:0.012575432685341443 - cluster/prob_snapshot/cluster_56:0.01167675081623829 - cluster/prob_snapshot/cluster_57:0.01167675081623829 - cluster/prob_snapshot/cluster_58:0.013953111990676581 - cluster/prob_snapshot/cluster_59:0.020188863735473992 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.01624352194698698 - cluster/prob_snapshot/cluster_62:0.020188863735473992 - cluster/prob_snapshot/cluster_63:0.00937869289381736
[36m(TaskRunner pid=2823680)[0m Training Progress:  30%|███       | 241/800 [7:47:17<18:03:51, 116.33s/it]
[36m(TaskRunner pid=2823680)[0m step:241 - global_seqlen/min:347291 - global_seqlen/max:509110 - global_seqlen/minmax_diff:161819 - global_seqlen/balanced_min:435062 - global_seqlen/balanced_max:435218 - global_seqlen/mean:435165.0 - frontier/skipped_zero_acc_count:28.0 - actor/entropy:np.float64(0.19608208257704973) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012317848391830921 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.033669503915007226) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005560097173292888) - actor/ppo_kl:np.float64(-2.1323991191195546e-05) - actor/pg_clipfrac_lower:np.float64(2.6718726257968228e-05) - actor/grad_norm:np.float64(0.2750321305715121) - perf/mfu/actor:np.float64(0.23363317502914557) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.84872436523438) - actor/lr:np.float64(1e-06) - training/global_step:241 - training/epoch:0 - critic/score/mean:0.6212499737739563 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.613988995552063 - critic/rewards/max:1.27384614944458 - critic/rewards/min:-0.07485764473676682 - critic/advantages/mean:-0.09143629670143127 - critic/advantages/max:2.474771738052368 - critic/advantages/min:-2.4748342037200928 - critic/returns/mean:-0.09143629670143127 - critic/returns/max:2.474771738052368 - critic/returns/min:-2.4748342037200928 - response_length/mean:1259.188720703125 - response_length/max:8192.0 - response_length/min:199.0 - response_length/clip_ratio:0.02500000037252903 - response_length_non_aborted/mean:1259.188720703125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:199.0 - response_length_non_aborted/clip_ratio:0.02500000037252903 - response/aborted_ratio:0.0 - prompt_length/mean:234.83999633789062 - prompt_length/max:366.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.670706301927567e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5442451648414135) - timing_s/agent_loop/generate_sequences/max:np.float64(31.416494474746287) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.569889115389742) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(31.416494474746287) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:202 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:33.20770249143243 - timing_s/reward:0.00011665374040603638 - timing_s/old_log_prob:11.370397311635315 - timing_s/ref:22.460623468272388 - timing_s/adv:0.08685561642050743 - timing_s/update_actor:22.245160759426653 - timing_s/update_weights:29.473590979352593 - timing_s/step:119.23290897998959 - timing_s/stop_profile:5.311518907546997e-05 - timing_per_token_ms/adv:7.266896338215331e-05 - timing_per_token_ms/update_actor:0.018611724138028345 - timing_per_token_ms/gen:0.032965374026960245 - timing_per_token_ms/ref:0.01879199401975396 - perf/total_num_tokens:1740660 - perf/time_per_step:119.23290897998959 - perf/throughput:3649.7054690918603 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:553.0 - frontier/mean_score:2.592478872714285 - frontier/mean_frontier_pct:0.25855849038770595 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.4829929999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:3.1883509999999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.51 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:2.7815089999999993 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.6569999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:2.237 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9596463929999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.8823509999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:1.91 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.441253 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:3.5942351989999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:2.9176456999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:3.8116456999999992 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.2569999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.9423519899999997 
- frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:2.339899999999999 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.51 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.770824598999999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:3.2519299999999993 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.5883509999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.4565476999999993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.3825509999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.5540999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:2.0412562999999992 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:3.0264109999999995 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.037448999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - 
frontier/cluster_33/score:2.1815089999999993 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.62613 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:3.6538999999999997 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.024291 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:2.8319299999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:3.0769999999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.8319299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:2.2600999999999996 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.6569999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.2600999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:3.5434456999999995 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.339899999999999 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:3.4823509999999995 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:2.5540999999999996 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.3 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:3.04613 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:1.7398999999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:1.91 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:1.91 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.2823509999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:3.3023509999999994 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.6569999999999996 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:3.3023509999999994 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.5340999999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:241.0 - cluster/prob_snapshot/cluster_0:0.021325382342159648 - cluster/prob_snapshot/cluster_1:0.01952137259994696 - cluster/prob_snapshot/cluster_2:0.009122849138605182 - cluster/prob_snapshot/cluster_3:0.015368021032146984 - cluster/prob_snapshot/cluster_4:0.01703039394944467 - 
cluster/prob_snapshot/cluster_5:0.016268060510922125 - cluster/prob_snapshot/cluster_6:0.013696519142993151 - cluster/prob_snapshot/cluster_7:0.018121078890574484 - cluster/prob_snapshot/cluster_8:0.017647821031884416 - cluster/prob_snapshot/cluster_9:0.011694390506534159 - cluster/prob_snapshot/cluster_10:0.027192537654615892 - cluster/prob_snapshot/cluster_11:0.016898088146064725 - cluster/prob_snapshot/cluster_12:0.022006486905464143 - cluster/prob_snapshot/cluster_13:0.01786392051073833 - cluster/prob_snapshot/cluster_14:0.023337629993901435 - cluster/prob_snapshot/cluster_15:0.01994169103653495 - cluster/prob_snapshot/cluster_16:0.01801519014593607 - cluster/prob_snapshot/cluster_17:0.014326546778135744 - cluster/prob_snapshot/cluster_18:0.015368021032146984 - cluster/prob_snapshot/cluster_19:0.015368021032146984 - cluster/prob_snapshot/cluster_20:0.016964976380008854 - cluster/prob_snapshot/cluster_21:0.01991064885859352 - cluster/prob_snapshot/cluster_22:0.007592782024687441 - cluster/prob_snapshot/cluster_23:0.01584774207433413 - cluster/prob_snapshot/cluster_24:0.021163465239928 - cluster/prob_snapshot/cluster_25:0.00822280965983004 - cluster/prob_snapshot/cluster_26:0.012245435085376084 - cluster/prob_snapshot/cluster_27:0.014587686804048934 - cluster/prob_snapshot/cluster_28:0.015638032875779525 - cluster/prob_snapshot/cluster_29:0.009122849138605182 - cluster/prob_snapshot/cluster_30:0.01249803575713248 - cluster/prob_snapshot/cluster_31:0.018529859721084056 - cluster/prob_snapshot/cluster_32:0.012474724734632204 - cluster/prob_snapshot/cluster_33:0.013356763423831844 - cluster/prob_snapshot/cluster_34:0.016079052220379347 - cluster/prob_snapshot/cluster_35:0.022371797629227835 - cluster/prob_snapshot/cluster_36:0.012394162017205518 - cluster/prob_snapshot/cluster_37:0.017339107490664543 - cluster/prob_snapshot/cluster_38:0.0188396018788511 - cluster/prob_snapshot/cluster_39:0.01786392051073833 - cluster/prob_snapshot/cluster_40:0.012245435085376084 - 
cluster/prob_snapshot/cluster_41:0.017339107490664543 - cluster/prob_snapshot/cluster_42:0.013837953918229241 - cluster/prob_snapshot/cluster_43:0.016268060510922125 - cluster/prob_snapshot/cluster_44:0.013837953918229241 - cluster/prob_snapshot/cluster_45:0.014467981553371841 - cluster/prob_snapshot/cluster_46:0.021695517148952504 - cluster/prob_snapshot/cluster_47:0.017647821031884416 - cluster/prob_snapshot/cluster_48:0.014326546778135744 - cluster/prob_snapshot/cluster_49:0.02132145155749724 - cluster/prob_snapshot/cluster_50:0.009122849138605182 - cluster/prob_snapshot/cluster_51:0.015638032875779525 - cluster/prob_snapshot/cluster_52:0.014082250348182496 - cluster/prob_snapshot/cluster_53:0.012245435085376084 - cluster/prob_snapshot/cluster_54:0.018650593588308322 - cluster/prob_snapshot/cluster_55:0.010652916252522921 - cluster/prob_snapshot/cluster_56:0.011694390506534159 - cluster/prob_snapshot/cluster_57:0.011694390506534159 - cluster/prob_snapshot/cluster_58:0.013974190506271592 - cluster/prob_snapshot/cluster_59:0.020219362399813395 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.016268060510922125 - cluster/prob_snapshot/cluster_62:0.020219362399813395 - cluster/prob_snapshot/cluster_63:0.009392860982237725
[36m(RewardLoopWorker pid=2826759)[0m WARNING:2026-04-12 19:19:47,008:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  30%|███       | 242/800 [7:49:17<18:11:59, 117.42s/it]
[36m(TaskRunner pid=2823680)[0m step:242 - global_seqlen/min:371552 - global_seqlen/max:547587 - global_seqlen/minmax_diff:176035 - global_seqlen/balanced_min:450146 - global_seqlen/balanced_max:450239 - global_seqlen/mean:450191.25 - frontier/skipped_zero_acc_count:26.0 - actor/entropy:np.float64(0.1762823648020333) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011119123548269272 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.01818388440005947) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000633652679158224) - actor/ppo_kl:np.float64(0.0008802306355978187) - actor/pg_clipfrac_lower:np.float64(1.4560833269748472e-05) - actor/grad_norm:np.float64(0.269888332256904) - perf/mfu/actor:np.float64(0.22384741385868873) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.84405517578125) - actor/lr:np.float64(1e-06) - training/global_step:242 - training/epoch:0 - critic/score/mean:0.6213235259056091 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6166707873344421 - critic/rewards/max:1.0438776016235352 - critic/rewards/min:-0.06379793584346771 - critic/advantages/mean:-0.14289064705371857 - critic/advantages/max:2.474717617034912 - critic/advantages/min:-2.474849224090576 - critic/returns/mean:-0.14289064705371857 - critic/returns/max:2.474717617034912 - critic/returns/min:-2.474849224090576 - response_length/mean:1392.1470947265625 - response_length/max:8192.0 - response_length/min:192.0 - response_length/clip_ratio:0.033088237047195435 - response_length_non_aborted/mean:1392.1470947265625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:192.0 - response_length_non_aborted/clip_ratio:0.033088237047195435 - response/aborted_ratio:0.0 - prompt_length/mean:242.12745666503906 - prompt_length/max:479.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.785778820514679e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4827438155189157) - timing_s/agent_loop/generate_sequences/max:np.float64(33.02101899776608) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.897267760286013) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(33.02101899776608) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:191 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:34.63618266116828 - timing_s/reward:0.0001374669373035431 - timing_s/old_log_prob:10.798049112781882 - timing_s/ref:21.256838579662144 - timing_s/adv:0.08655719179660082 - timing_s/update_actor:24.05912827141583 - timing_s/update_weights:28.481088960543275 - timing_s/step:119.72328530531377 - timing_s/stop_profile:5.559157580137253e-05 - timing_per_token_ms/adv:6.490647030867629e-05 - timing_per_token_ms/update_actor:0.018041170957473356 - timing_per_token_ms/gen:0.030489812129987075 - timing_per_token_ms/ref:0.015939823525806068 - perf/total_num_tokens:1800765 - perf/time_per_step:119.72328530531377 - perf/throughput:3760.2647542784966 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:579.0 - frontier/mean_score:2.609239519904761 - frontier/mean_frontier_pct:0.2763810271500635 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:14.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.4829929999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:3.1883509999999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:2.51 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:2.7815089999999993 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.6569999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:2.237 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9596463929999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.8823509999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:2.237 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:96.0 - frontier/cluster_10/score:4.608877099999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:3.5942351989999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.9423519899999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:3.568151989999999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.2569999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:2.9596463929999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:2.339899999999999 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.51 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.770824598999999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:3.2519299999999993 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.5883509999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.4565476999999993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.3825509999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.5540999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:2.0412562999999992 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:3.018487699999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.037448999999999 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.1815089999999993 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.7382909999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:3.6538999999999997 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.024291 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.8823509999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:3.0769999999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.8319299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:2.2600999999999996 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.7598999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:2.4820699999999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:3.3804119899999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.339899999999999 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - 
frontier/cluster_49/score:3.4823509999999995 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:2.5540999999999996 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:2.51 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.3 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:3.04613 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:1.7398999999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:1.91 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.237 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.2823509999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:3.3023509999999994 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.6569999999999996 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:3.211645699999999 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.5340999999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:242.0 - cluster/prob_snapshot/cluster_0:0.021188397137500477 - cluster/prob_snapshot/cluster_1:0.019395975588164197 - cluster/prob_snapshot/cluster_2:0.009064247827909995 - cluster/prob_snapshot/cluster_3:0.015269303387955763 - 
cluster/prob_snapshot/cluster_4:0.016920997927222882 - cluster/prob_snapshot/cluster_5:0.016163561395138828 - cluster/prob_snapshot/cluster_6:0.013608538517472926 - cluster/prob_snapshot/cluster_7:0.01800467677122946 - cluster/prob_snapshot/cluster_8:0.01753445892015047 - cluster/prob_snapshot/cluster_9:0.013608538517472926 - cluster/prob_snapshot/cluster_10:0.028037586740120208 - cluster/prob_snapshot/cluster_11:0.016789542000166975 - cluster/prob_snapshot/cluster_12:0.021865126574183484 - cluster/prob_snapshot/cluster_13:0.01789946821094238 - cluster/prob_snapshot/cluster_14:0.021706452298664577 - cluster/prob_snapshot/cluster_15:0.019813594077518693 - cluster/prob_snapshot/cluster_16:0.01800467677122946 - cluster/prob_snapshot/cluster_17:0.014234519122501068 - cluster/prob_snapshot/cluster_18:0.015269303387955763 - cluster/prob_snapshot/cluster_19:0.015269303387955763 - cluster/prob_snapshot/cluster_20:0.016856000572486794 - cluster/prob_snapshot/cluster_21:0.01978275130135258 - cluster/prob_snapshot/cluster_22:0.007544009215698782 - cluster/prob_snapshot/cluster_23:0.015745942905784335 - cluster/prob_snapshot/cluster_24:0.021027520122008245 - cluster/prob_snapshot/cluster_25:0.008169989820726928 - cluster/prob_snapshot/cluster_26:0.010341759266742946 - cluster/prob_snapshot/cluster_27:0.01449398169572804 - cluster/prob_snapshot/cluster_28:0.01553758079011068 - cluster/prob_snapshot/cluster_29:0.009064247827909995 - cluster/prob_snapshot/cluster_30:0.012417753680189655 - cluster/prob_snapshot/cluster_31:0.018362631260602703 - cluster/prob_snapshot/cluster_32:0.012394592397803614 - cluster/prob_snapshot/cluster_33:0.013270965244843018 - cluster/prob_snapshot/cluster_34:0.01665808607311106 - cluster/prob_snapshot/cluster_35:0.02222809069691297 - cluster/prob_snapshot/cluster_36:0.012314547181079027 - cluster/prob_snapshot/cluster_37:0.01753445892015047 - cluster/prob_snapshot/cluster_38:0.01871858427280473 - cluster/prob_snapshot/cluster_39:0.017749170267675123 - 
cluster/prob_snapshot/cluster_40:0.012166775607932879 - cluster/prob_snapshot/cluster_41:0.017227728423686674 - cluster/prob_snapshot/cluster_42:0.013749064775744547 - cluster/prob_snapshot/cluster_43:0.016789542000166975 - cluster/prob_snapshot/cluster_44:0.015099394366590977 - cluster/prob_snapshot/cluster_45:0.014375045380772694 - cluster/prob_snapshot/cluster_46:0.020564357072347918 - cluster/prob_snapshot/cluster_47:0.01753445892015047 - cluster/prob_snapshot/cluster_48:0.014234519122501068 - cluster/prob_snapshot/cluster_49:0.02118449160253033 - cluster/prob_snapshot/cluster_50:0.009064247827909995 - cluster/prob_snapshot/cluster_51:0.01553758079011068 - cluster/prob_snapshot/cluster_52:0.015269303387955763 - cluster/prob_snapshot/cluster_53:0.01399179194912281 - cluster/prob_snapshot/cluster_54:0.01853079009129629 - cluster/prob_snapshot/cluster_55:0.010584486440121206 - cluster/prob_snapshot/cluster_56:0.011619270705575899 - cluster/prob_snapshot/cluster_57:0.013608538517472926 - cluster/prob_snapshot/cluster_58:0.013884426237770604 - cluster/prob_snapshot/cluster_59:0.02008948179781637 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.016163561395138828 - cluster/prob_snapshot/cluster_62:0.019537686282041253 - cluster/prob_snapshot/cluster_63:0.009332525230064913
[36m(RewardLoopWorker pid=2826762)[0m WARNING:2026-04-12 19:21:42,349:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  30%|███       | 243/800 [7:51:11<18:00:21, 116.38s/it]
[36m(TaskRunner pid=2823680)[0m step:243 - global_seqlen/min:349794 - global_seqlen/max:432736 - global_seqlen/minmax_diff:82942 - global_seqlen/balanced_min:397905 - global_seqlen/balanced_max:398150 - global_seqlen/mean:398075.25 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.1884264669691523) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011875796131789684 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.08015584743043291) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00039896744245500787) - actor/ppo_kl:np.float64(4.88214896752955e-05) - actor/pg_clipfrac_lower:np.float64(1.6204858942122276e-06) - actor/grad_norm:np.float64(0.28121969227989513) - perf/mfu/actor:np.float64(0.22564326449801556) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.21162414550781) - actor/lr:np.float64(1e-06) - training/global_step:243 - training/epoch:0 - critic/score/mean:0.6565789580345154 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6473184823989868 - critic/rewards/max:1.2548129558563232 - critic/rewards/min:-1.3874330520629883 - critic/advantages/mean:-0.12848246097564697 - critic/advantages/max:2.4747443199157715 - critic/advantages/min:-2.4748497009277344 - critic/returns/mean:-0.12848246097564697 - critic/returns/max:2.4747443199157715 - critic/returns/min:-2.4748497009277344 - response_length/mean:1233.30126953125 - response_length/max:8192.0 - response_length/min:166.0 - response_length/clip_ratio:0.015789473429322243 - response_length_non_aborted/mean:1233.30126953125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:166.0 - response_length_non_aborted/clip_ratio:0.015789473429322243 - response/aborted_ratio:0.0 - prompt_length/mean:248.45263671875 - prompt_length/max:758.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.004065603017807e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.366140391677618) - timing_s/agent_loop/generate_sequences/max:np.float64(29.41129956021905) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.252491204749276) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.41129956021905) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:223 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.609132817015052 - timing_s/reward:0.00028115417808294296 - timing_s/old_log_prob:9.67718002665788 - timing_s/ref:21.90186749678105 - timing_s/adv:0.08441572729498148 - timing_s/update_actor:20.709337445907295 - timing_s/update_weights:29.35030491091311 - timing_s/step:113.74679597839713 - timing_s/stop_profile:8.17812979221344e-05 - timing_per_token_ms/adv:7.496070827778023e-05 - timing_per_token_ms/update_actor:0.018389779400752215 - timing_per_token_ms/gen:0.03372327889416943 - timing_per_token_ms/ref:0.01944873962203492 - perf/total_num_tokens:1592301 - perf/time_per_step:113.74679597839713 - perf/throughput:3499.6612131000397 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:612.0 - frontier/mean_score:2.633614664190475 - frontier/mean_frontier_pct:0.29263035766439893 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.4829929999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:2.5318456999999994 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:3.2569999999999997 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:2.7815089999999993 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.6569999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:2.237 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9596463929999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:3.5176456999999997 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:2.4659 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:96.0 - frontier/cluster_10/score:4.608877099999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:3.5942351989999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.9423519899999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:3.568151989999999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.2569999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:2.9596463929999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:2.339899999999999 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.6569999999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.770824598999999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:3.2519299999999993 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.5883509999999994 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.4565476999999993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.3825509999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.5540999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:2.0412562999999992 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:3.018487699999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.037448999999999 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.1815089999999993 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.7382909999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:3.4577299999999997 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.024291 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.8823509999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:3.0538999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.8319299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:2.4820699999999993 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.7598999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:2.6374489999999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.5540999999999996 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:3.3804119899999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.339899999999999 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - 
frontier/cluster_49/score:3.3376456999999995 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.0878699999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:2.6569999999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.3 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:3.04613 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:1.7398999999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:1.91 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.237 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.2823509999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:3.8116456999999992 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:2.7598999999999996 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:3.1481519899999992 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.5340999999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:243.0 - cluster/prob_snapshot/cluster_0:0.02099229015023614 - cluster/prob_snapshot/cluster_1:0.015259645813249614 - cluster/prob_snapshot/cluster_2:0.008980354632883802 - cluster/prob_snapshot/cluster_3:0.019630211435773516 - 
cluster/prob_snapshot/cluster_4:0.016764387405743614 - cluster/prob_snapshot/cluster_5:0.016013961248035072 - cluster/prob_snapshot/cluster_6:0.013482586116618164 - cluster/prob_snapshot/cluster_7:0.01783803637387609 - cluster/prob_snapshot/cluster_8:0.021201144871703877 - cluster/prob_snapshot/cluster_9:0.01486218556324038 - cluster/prob_snapshot/cluster_10:0.02777808779689735 - cluster/prob_snapshot/cluster_11:0.016634148155232214 - cluster/prob_snapshot/cluster_12:0.021662756188599783 - cluster/prob_snapshot/cluster_13:0.01773380156038347 - cluster/prob_snapshot/cluster_14:0.021505550506194664 - cluster/prob_snapshot/cluster_15:0.019630211435773516 - cluster/prob_snapshot/cluster_16:0.01783803637387609 - cluster/prob_snapshot/cluster_17:0.014102773023815301 - cluster/prob_snapshot/cluster_18:0.016013961248035072 - cluster/prob_snapshot/cluster_19:0.015127979952039154 - cluster/prob_snapshot/cluster_20:0.01669999162720674 - cluster/prob_snapshot/cluster_21:0.019599654121687123 - cluster/prob_snapshot/cluster_22:0.0074741864296907394 - cluster/prob_snapshot/cluster_23:0.015600207982804976 - cluster/prob_snapshot/cluster_24:0.020832902115086472 - cluster/prob_snapshot/cluster_25:0.008094373336887882 - cluster/prob_snapshot/cluster_26:0.010246042198592256 - cluster/prob_snapshot/cluster_27:0.01435983416841069 - cluster/prob_snapshot/cluster_28:0.015393774340837928 - cluster/prob_snapshot/cluster_29:0.008980354632883802 - cluster/prob_snapshot/cluster_30:0.012302822463495464 - cluster/prob_snapshot/cluster_31:0.018192677853018632 - cluster/prob_snapshot/cluster_32:0.012279875547929168 - cluster/prob_snapshot/cluster_33:0.013148137218005168 - cluster/prob_snapshot/cluster_34:0.016503908904720813 - cluster/prob_snapshot/cluster_35:0.02084002793608141 - cluster/prob_snapshot/cluster_36:0.012200571181312068 - cluster/prob_snapshot/cluster_37:0.017372170574796814 - cluster/prob_snapshot/cluster_38:0.018406110747224053 - 
cluster/prob_snapshot/cluster_39:0.017584894683965433 - cluster/prob_snapshot/cluster_40:0.012054167292461478 - cluster/prob_snapshot/cluster_41:0.017068278990270212 - cluster/prob_snapshot/cluster_42:0.014959643505799926 - cluster/prob_snapshot/cluster_43:0.016634148155232214 - cluster/prob_snapshot/cluster_44:0.015896125735667614 - cluster/prob_snapshot/cluster_45:0.015393774340837928 - cluster/prob_snapshot/cluster_46:0.020374025822451305 - cluster/prob_snapshot/cluster_47:0.017372170574796814 - cluster/prob_snapshot/cluster_48:0.014102773023815301 - cluster/prob_snapshot/cluster_49:0.020116269815382344 - cluster/prob_snapshot/cluster_50:0.008980354632883802 - cluster/prob_snapshot/cluster_51:0.012583767132455771 - cluster/prob_snapshot/cluster_52:0.016013961248035072 - cluster/prob_snapshot/cluster_53:0.013862292386330698 - cluster/prob_snapshot/cluster_54:0.01835928030729284 - cluster/prob_snapshot/cluster_55:0.01048652283607686 - cluster/prob_snapshot/cluster_56:0.011511729764300711 - cluster/prob_snapshot/cluster_57:0.013482586116618164 - cluster/prob_snapshot/cluster_58:0.01375592038705837 - cluster/prob_snapshot/cluster_59:0.022973107463695713 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.016634148155232214 - cluster/prob_snapshot/cluster_62:0.018974175374777753 - cluster/prob_snapshot/cluster_63:0.009246149021682576
[36m(TaskRunner pid=2823680)[0m Training Progress:  30%|███       | 244/800 [7:53:02<17:42:42, 114.68s/it]
[36m(TaskRunner pid=2823680)[0m step:244 - global_seqlen/min:348932 - global_seqlen/max:433520 - global_seqlen/minmax_diff:84588 - global_seqlen/balanced_min:402752 - global_seqlen/balanced_max:402986 - global_seqlen/mean:402845.5 - frontier/skipped_zero_acc_count:24.0 - actor/entropy:np.float64(0.19748533552942368) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011989554390311241 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05537038354668766) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006665158180812097) - actor/ppo_kl:np.float64(0.00027665332155361985) - actor/pg_clipfrac_lower:np.float64(1.1345333527620263e-05) - actor/grad_norm:np.float64(0.28096127051573533) - perf/mfu/actor:np.float64(0.17715919556930118) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.8858413696289) - actor/lr:np.float64(1e-06) - training/global_step:244 - training/epoch:0 - critic/score/mean:0.5925480723381042 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5870094895362854 - critic/rewards/max:1.0261586904525757 - critic/rewards/min:-0.06284262984991074 - critic/advantages/mean:-0.14074279367923737 - critic/advantages/max:2.474630355834961 - critic/advantages/min:-2.474837303161621 - critic/returns/mean:-0.14074279367923737 - critic/returns/max:2.474630355834961 - critic/returns/min:-2.474837303161621 - response_length/mean:1290.967529296875 - response_length/max:8192.0 - response_length/min:115.0 - response_length/clip_ratio:0.024038461968302727 - response_length_non_aborted/mean:1290.967529296875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:115.0 - response_length_non_aborted/clip_ratio:0.024038461968302727 - response/aborted_ratio:0.0 - prompt_length/mean:237.04808044433594 - prompt_length/max:431.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.346233516931534e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9267581263557076) - timing_s/agent_loop/generate_sequences/max:np.float64(29.57520097400993) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.057739814251363) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.57520097400993) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:225 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.593765448778868 - timing_s/reward:0.00016050226986408234 - timing_s/old_log_prob:11.896719738841057 - timing_s/ref:14.281503829173744 - timing_s/adv:0.14073811378329992 - timing_s/update_actor:26.886450714431703 - timing_s/update_weights:25.272472290322185 - timing_s/step:110.49411864113063 - timing_s/stop_profile:5.938578397035599e-05 - timing_per_token_ms/adv:0.00011070330956777614 - timing_per_token_ms/update_actor:0.02114863555157063 - timing_per_token_ms/gen:0.02941458585566214 - timing_per_token_ms/ref:0.011233699933827059 - perf/total_num_tokens:1611382 - perf/time_per_step:110.49411864113063 - perf/throughput:3645.8546839799283 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:636.0 - frontier/mean_score:2.635710515737704 - frontier/mean_frontier_pct:0.28650176401120075 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.4829929999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:2.5318456999999994 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:3.1798999999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.6569999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.8659000000000001 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9596463929999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:3.5176456999999997 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:2.62613 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:96.0 - frontier/cluster_10/score:4.608877099999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:3.5942351989999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.9423519899999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:3.397706392999999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.2569999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:2.9596463929999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:2.339899999999999 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.6569999999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.770824598999999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:48.0 - frontier/cluster_21/score:3.1763509999999995 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.7118456999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.4565476999999993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.3825509999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.6878699999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:2.328879409999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:3.018487699999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.3262142999999993 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.1815089999999993 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.7382909999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:3.4577299999999997 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.7170036999999998 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3176456999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:3.0538999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:2.8319299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:2.4820699999999993 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.7598999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:2.6374489999999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.5540999999999996 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:3.8662883929999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.339899999999999 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:3.8363519899999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.0878699999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:2.6569999999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.3 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:3.04613 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.1179299999999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:1.91 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.237 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.2823509999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:3.8116456999999992 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:2.7598999999999996 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.5340999999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:244.0 - cluster/prob_snapshot/cluster_0:0.021663322113983453 - cluster/prob_snapshot/cluster_1:0.01574743013896494 - cluster/prob_snapshot/cluster_2:0.009267417405040822 - 
cluster/prob_snapshot/cluster_3:0.01977816148073108 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.016525857748451986 - cluster/prob_snapshot/cluster_6:0.011605418883265552 - cluster/prob_snapshot/cluster_7:0.018408240600841935 - cluster/prob_snapshot/cluster_8:0.021878853010031545 - cluster/prob_snapshot/cluster_9:0.016333854275100573 - cluster/prob_snapshot/cluster_10:0.02866603211693561 - cluster/prob_snapshot/cluster_11:0.01716586932629004 - cluster/prob_snapshot/cluster_12:0.022355220027532187 - cluster/prob_snapshot/cluster_13:0.018300673854954697 - cluster/prob_snapshot/cluster_14:0.021132861317923936 - cluster/prob_snapshot/cluster_15:0.020257703683367757 - cluster/prob_snapshot/cluster_16:0.018408240600841935 - cluster/prob_snapshot/cluster_17:0.014553577171849002 - cluster/prob_snapshot/cluster_18:0.016525857748451986 - cluster/prob_snapshot/cluster_19:0.015611555494397626 - cluster/prob_snapshot/cluster_20:0.0172338175269046 - cluster/prob_snapshot/cluster_21:0.01975608761202605 - cluster/prob_snapshot/cluster_22:0.007713103573148405 - cluster/prob_snapshot/cluster_23:0.01686698391943967 - cluster/prob_snapshot/cluster_24:0.02149883913847907 - cluster/prob_snapshot/cluster_25:0.008353115150986459 - cluster/prob_snapshot/cluster_26:0.010573563482261341 - cluster/prob_snapshot/cluster_27:0.01481885544013249 - cluster/prob_snapshot/cluster_28:0.016717861221803403 - cluster/prob_snapshot/cluster_29:0.009267417405040822 - cluster/prob_snapshot/cluster_30:0.014485031931862544 - cluster/prob_snapshot/cluster_31:0.01877421842139707 - cluster/prob_snapshot/cluster_32:0.014468455631996542 - cluster/prob_snapshot/cluster_33:0.013568425822720264 - cluster/prob_snapshot/cluster_34:0.017031466894944048 - cluster/prob_snapshot/cluster_35:0.021506192740893826 - cluster/prob_snapshot/cluster_36:0.010679322130133884 - cluster/prob_snapshot/cluster_37:0.01441516114020001 - cluster/prob_snapshot/cluster_38:0.018994473834398767 - 
cluster/prob_snapshot/cluster_39:0.018147007075115778 - cluster/prob_snapshot/cluster_40:0.012439486449719225 - cluster/prob_snapshot/cluster_41:0.017613877430776678 - cluster/prob_snapshot/cluster_42:0.015437838066127293 - cluster/prob_snapshot/cluster_43:0.01716586932629004 - cluster/prob_snapshot/cluster_44:0.016404255548662754 - cluster/prob_snapshot/cluster_45:0.015885846170613935 - cluster/prob_snapshot/cluster_46:0.024047321037715105 - cluster/prob_snapshot/cluster_47:0.017927483103917324 - cluster/prob_snapshot/cluster_48:0.014553577171849002 - cluster/prob_snapshot/cluster_49:0.023861124297979187 - cluster/prob_snapshot/cluster_50:0.009267417405040822 - cluster/prob_snapshot/cluster_51:0.012986015286887637 - cluster/prob_snapshot/cluster_52:0.016525857748451986 - cluster/prob_snapshot/cluster_53:0.014305409417177107 - cluster/prob_snapshot/cluster_54:0.01894614642954161 - cluster/prob_snapshot/cluster_55:0.013172980768226915 - cluster/prob_snapshot/cluster_56:0.011879709559481859 - cluster/prob_snapshot/cluster_57:0.013913565594010953 - cluster/prob_snapshot/cluster_58:0.014195637169001556 - cluster/prob_snapshot/cluster_59:0.023707457518140268 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.01716586932629004 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.00954170808125713
[36m(TaskRunner pid=2823680)[0m Training Progress:  31%|███       | 245/800 [7:54:57<17:42:32, 114.87s/it]
[36m(TaskRunner pid=2823680)[0m step:245 - global_seqlen/min:387697 - global_seqlen/max:418889 - global_seqlen/minmax_diff:31192 - global_seqlen/balanced_min:406691 - global_seqlen/balanced_max:406771 - global_seqlen/mean:406735.25 - frontier/skipped_zero_acc_count:26.0 - actor/entropy:np.float64(0.20876839083126364) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011870993301272392 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.03979517578409286) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000502888293875217) - actor/ppo_kl:np.float64(-0.00023580952531529858) - actor/pg_clipfrac_lower:np.float64(1.812442468351447e-05) - actor/grad_norm:np.float64(0.3360130053300124) - perf/mfu/actor:np.float64(0.19132892689416126) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.79206466674805) - actor/lr:np.float64(1e-06) - training/global_step:245 - training/epoch:0 - critic/score/mean:0.6446078419685364 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6385108232498169 - critic/rewards/max:1.0810350179672241 - critic/rewards/min:-0.04785413295030594 - critic/advantages/mean:-0.19061718881130219 - critic/advantages/max:2.47481632232666 - critic/advantages/min:-2.474637269973755 - critic/returns/mean:-0.19061718881130219 - critic/returns/max:2.47481632232666 - critic/returns/min:-2.474637269973755 - response_length/mean:1197.31982421875 - response_length/max:8192.0 - response_length/min:180.0 - response_length/clip_ratio:0.019607843831181526 - response_length_non_aborted/mean:1197.31982421875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:180.0 - response_length_non_aborted/clip_ratio:0.019607843831181526 - response/aborted_ratio:0.0 - prompt_length/mean:230.57843017578125 - prompt_length/max:393.0 - prompt_length/min:178.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.833501487970352e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4337228862568736) - timing_s/agent_loop/generate_sequences/max:np.float64(30.789570717141032) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.030481872150631) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.789570717141032) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:187 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.694744494743645 - timing_s/reward:0.0002851933240890503 - timing_s/old_log_prob:11.184804868884385 - timing_s/ref:19.028450057841837 - timing_s/adv:0.08084581978619099 - timing_s/update_actor:25.258440585806966 - timing_s/update_weights:26.33116199914366 - timing_s/step:115.09619425516576 - timing_s/stop_profile:6.289221346378326e-05 - timing_per_token_ms/adv:6.938572630158904e-05 - timing_per_token_ms/update_actor:0.02167799460660676 - timing_per_token_ms/gen:0.03346398102660215 - timing_per_token_ms/ref:0.016331120534724127 - perf/total_num_tokens:1626941 - perf/time_per_step:115.09619425516576 - perf/throughput:3533.8722764219015 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:662.0 - frontier/mean_score:2.669951832608196 - frontier/mean_frontier_pct:0.29727418740639433 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.4829929999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:2.5318456999999994 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.9429999999999998 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.7259299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.6569999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.8659000000000001 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9596463929999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:3.5176456999999997 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:2.62613 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:96.0 - frontier/cluster_10/score:4.608877099999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:3.5942351989999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.9423519899999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:3.278394475099999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.2569999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:2.9596463929999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:2.339899999999999 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.6569999999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.770824598999999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:48.0 - frontier/cluster_21/score:3.1234456999999995 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.7118456999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:3.319583389999999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.7 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.3825509999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.7815089999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.530215586999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:3.012941389999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.3262142999999993 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.1815089999999993 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.7382909999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:3.4577299999999997 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.7170036999999998 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3176456999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:3.0538999999999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.4823509999999995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:2.4820699999999993 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:3.4319299999999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:2.6374489999999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.5540999999999996 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:3.8662883929999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.9176456999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.339899999999999 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:3.8363519899999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.0878699999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:2.6569999999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.0322909999999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.1179299999999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.237 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.237 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.2823509999999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:3.568151989999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:2.8319299999999994 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.5340999999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:245.0 - cluster/prob_snapshot/cluster_0:0.021385496623683196 - cluster/prob_snapshot/cluster_1:0.015545474156576488 - cluster/prob_snapshot/cluster_2:0.011929975150629489 - 
cluster/prob_snapshot/cluster_3:0.022877124196080765 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01631391866969766 - cluster/prob_snapshot/cluster_6:0.011456582930293138 - cluster/prob_snapshot/cluster_7:0.01817216053687092 - cluster/prob_snapshot/cluster_8:0.02159826340181095 - cluster/prob_snapshot/cluster_9:0.016124377582255593 - cluster/prob_snapshot/cluster_10:0.02829839900942115 - cluster/prob_snapshot/cluster_11:0.016945722294504542 - cluster/prob_snapshot/cluster_12:0.022068521157790958 - cluster/prob_snapshot/cluster_13:0.018065973301649628 - cluster/prob_snapshot/cluster_14:0.020129266403450332 - cluster/prob_snapshot/cluster_15:0.019997904820175112 - cluster/prob_snapshot/cluster_16:0.01817216053687092 - cluster/prob_snapshot/cluster_17:0.01436693198917032 - cluster/prob_snapshot/cluster_18:0.01631391866969766 - cluster/prob_snapshot/cluster_19:0.015411342062830681 - cluster/prob_snapshot/cluster_20:0.01701279908019707 - cluster/prob_snapshot/cluster_21:0.01917788450094726 - cluster/prob_snapshot/cluster_22:0.0076141853753451515 - cluster/prob_snapshot/cluster_23:0.016650670001719724 - cluster/prob_snapshot/cluster_24:0.020382165390191657 - cluster/prob_snapshot/cluster_25:0.008245989000152035 - cluster/prob_snapshot/cluster_26:0.01043796075968612 - cluster/prob_snapshot/cluster_27:0.014628808144677012 - cluster/prob_snapshot/cluster_28:0.017078401055713986 - cluster/prob_snapshot/cluster_29:0.00914856560701901 - cluster/prob_snapshot/cluster_30:0.015535465300383631 - cluster/prob_snapshot/cluster_31:0.018499390588267144 - cluster/prob_snapshot/cluster_32:0.014282902107071005 - cluster/prob_snapshot/cluster_33:0.01339441490523653 - cluster/prob_snapshot/cluster_34:0.016813043533295095 - cluster/prob_snapshot/cluster_35:0.021230382386817345 - cluster/prob_snapshot/cluster_36:0.010542363085197575 - cluster/prob_snapshot/cluster_37:0.014230291100856038 - cluster/prob_snapshot/cluster_38:0.018750875508238493 - 
cluster/prob_snapshot/cluster_39:0.01791427725133349 - cluster/prob_snapshot/cluster_40:0.012279953834924848 - cluster/prob_snapshot/cluster_41:0.021381554758502185 - cluster/prob_snapshot/cluster_42:0.015239852507525953 - cluster/prob_snapshot/cluster_43:0.021071970982346814 - cluster/prob_snapshot/cluster_44:0.016193875980984346 - cluster/prob_snapshot/cluster_45:0.015682115044890775 - cluster/prob_snapshot/cluster_46:0.023738921489272886 - cluster/prob_snapshot/cluster_47:0.01791427725133349 - cluster/prob_snapshot/cluster_48:0.01436693198917032 - cluster/prob_snapshot/cluster_49:0.023555112665861033 - cluster/prob_snapshot/cluster_50:0.00914856560701901 - cluster/prob_snapshot/cluster_51:0.012819473606662269 - cluster/prob_snapshot/cluster_52:0.01631391866969766 - cluster/prob_snapshot/cluster_53:0.01172735591235323 - cluster/prob_snapshot/cluster_54:0.018618196747029046 - cluster/prob_snapshot/cluster_55:0.013004041312801187 - cluster/prob_snapshot/cluster_56:0.013735128364363443 - cluster/prob_snapshot/cluster_57:0.013735128364363443 - cluster/prob_snapshot/cluster_58:0.014013582457547276 - cluster/prob_snapshot/cluster_59:0.02190837085659761 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.017387984831869357 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.009419338589079103
[36m(RewardLoopWorker pid=2826757)[0m WARNING:2026-04-12 19:27:19,121:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  31%|███       | 246/800 [7:56:49<17:33:21, 114.08s/it]
[36m(TaskRunner pid=2823680)[0m step:246 - global_seqlen/min:364545 - global_seqlen/max:410180 - global_seqlen/minmax_diff:45635 - global_seqlen/balanced_min:382171 - global_seqlen/balanced_max:382276 - global_seqlen/mean:382230.75 - frontier/skipped_zero_acc_count:29.0 - actor/entropy:np.float64(0.21455108745023607) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011968422681093216 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.12384164950344712) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005262891769962152) - actor/ppo_kl:np.float64(-2.1048660705673683e-05) - actor/pg_clipfrac_lower:np.float64(2.853452206181828e-05) - actor/grad_norm:np.float64(0.275052941762484) - perf/mfu/actor:np.float64(0.20690631389200087) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.70530700683594) - actor/lr:np.float64(1e-06) - training/global_step:246 - training/epoch:0 - critic/score/mean:0.6388888955116272 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.635368287563324 - critic/rewards/max:1.4229027032852173 - critic/rewards/min:-0.07937254011631012 - critic/advantages/mean:-0.15822143852710724 - critic/advantages/max:2.47480845451355 - critic/advantages/min:-2.4748542308807373 - critic/returns/mean:-0.15822143852710724 - critic/returns/max:2.47480845451355 - critic/returns/min:-2.4748542308807373 - response_length/mean:1186.125 - response_length/max:8192.0 - response_length/min:215.0 - response_length/clip_ratio:0.022727273404598236 - response_length_non_aborted/mean:1186.125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:215.0 - response_length_non_aborted/clip_ratio:0.022727273404598236 - response/aborted_ratio:0.0 - prompt_length/mean:235.78787231445312 - prompt_length/max:360.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.17511060833931e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9597027348354459) - timing_s/agent_loop/generate_sequences/max:np.float64(29.347280515357852) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.603097570758109) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.347280515357852) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:218 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.304561654105783 - timing_s/reward:0.00017234403640031815 - timing_s/old_log_prob:11.07888543792069 - timing_s/ref:20.15415849816054 - timing_s/adv:0.12172438576817513 - timing_s/update_actor:21.88870724104345 - timing_s/update_weights:26.65329310670495 - timing_s/step:111.67918459884822 - timing_s/stop_profile:5.8379024267196655e-05 - timing_per_token_ms/adv:0.00010808848317343094 - timing_per_token_ms/update_actor:0.019436673673733588 - timing_per_token_ms/gen:0.033323605593404575 - timing_per_token_ms/ref:0.017896433881801833 - perf/total_num_tokens:1528923 - perf/time_per_step:111.67918459884822 - perf/throughput:3422.578266245168 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:691.0 - frontier/mean_score:2.6647177799929502 - frontier/mean_frontier_pct:0.31522849365863936 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.4829929999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:2.5318456999999994 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.9429999999999998 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.7259299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.6569999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:1.8659000000000001 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:2.9596463929999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:3.3623519899999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:2.1382909999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:112.0 - frontier/cluster_10/score:4.726213969999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:3.4319299999999995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:3.5942351989999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.9423519899999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.194876132569999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.1798999999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:2.9596463929999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:1.9379299999999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.7598999999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.770824598999999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:48.0 - frontier/cluster_21/score:3.1234456999999995 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.798291989999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:3.223708372999999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.343 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:2.09 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.3825509999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.7815089999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.530215586999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:3.012941389999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.3262142999999993 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.1815089999999993 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.7382909999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:3.4577299999999997 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.7170036999999998 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3176456999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:3.0377299999999994 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.3376456999999995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:2.4820699999999993 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:3.4319299999999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.146214299999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.5540999999999996 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:3.8662883929999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.9176456999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.339899999999999 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 
- frontier/cluster_49/score:3.8363519899999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.0878699999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:2.6569999999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.0322909999999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:2.1179299999999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.237 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.237 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:2.4976456999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:3.568151989999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:2.8823509999999994 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.5340999999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:246.0 - cluster/prob_snapshot/cluster_0:0.02142750212812045 - cluster/prob_snapshot/cluster_1:0.015576008658306979 - cluster/prob_snapshot/cluster_2:0.011953408070282666 - cluster/prob_snapshot/cluster_3:0.0229220595632055 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0163459625541642 - cluster/prob_snapshot/cluster_6:0.011479086010468568 - cluster/prob_snapshot/cluster_7:0.018207854389742244 - cluster/prob_snapshot/cluster_8:0.020685314159751406 - cluster/prob_snapshot/cluster_9:0.013154845546069372 - cluster/prob_snapshot/cluster_10:0.02907584364944965 - cluster/prob_snapshot/cluster_11:0.021113360658077812 - cluster/prob_snapshot/cluster_12:0.022111868262594245 - cluster/prob_snapshot/cluster_13:0.01810145858099756 - cluster/prob_snapshot/cluster_14:0.01965499647278214 - cluster/prob_snapshot/cluster_15:0.019562862749712737 - cluster/prob_snapshot/cluster_16:0.018207854389742244 - cluster/prob_snapshot/cluster_17:0.011922217242224848 - cluster/prob_snapshot/cluster_18:0.01697900717095889 - cluster/prob_snapshot/cluster_19:0.015441613101600355 - cluster/prob_snapshot/cluster_20:0.017046215709225074 - cluster/prob_snapshot/cluster_21:0.019215553802094536 - cluster/prob_snapshot/cluster_22:0.00762914119812534 - cluster/prob_snapshot/cluster_23:0.01721519611748499 - cluster/prob_snapshot/cluster_24:0.01983237348536078 - cluster/prob_snapshot/cluster_25:0.00826218581492003 - cluster/prob_snapshot/cluster_26:0.012857757522846512 - cluster/prob_snapshot/cluster_27:0.0146575421262275 - cluster/prob_snapshot/cluster_28:0.017111946540485777 - cluster/prob_snapshot/cluster_29:0.009166535267483877 - cluster/prob_snapshot/cluster_30:0.015565980142666382 - cluster/prob_snapshot/cluster_31:0.018535727188118717 - cluster/prob_snapshot/cluster_32:0.014310956658171352 - cluster/prob_snapshot/cluster_33:0.01342072428512314 - cluster/prob_snapshot/cluster_34:0.016846067801432007 - cluster/prob_snapshot/cluster_35:0.021272083215058406 - cluster/prob_snapshot/cluster_36:0.010563070449966648 - cluster/prob_snapshot/cluster_37:0.014258242313142518 - cluster/prob_snapshot/cluster_38:0.01868822763630456 - cluster/prob_snapshot/cluster_39:0.017949464568505154 - 
cluster/prob_snapshot/cluster_40:0.012304074184542118 - cluster/prob_snapshot/cluster_41:0.020533320147258997 - cluster/prob_snapshot/cluster_42:0.015269786705613223 - cluster/prob_snapshot/cluster_43:0.021113360658077812 - cluster/prob_snapshot/cluster_44:0.01320358998156256 - cluster/prob_snapshot/cluster_45:0.015712917937369508 - cluster/prob_snapshot/cluster_46:0.023785549603153062 - cluster/prob_snapshot/cluster_47:0.017949464568505154 - cluster/prob_snapshot/cluster_48:0.014395151592205046 - cluster/prob_snapshot/cluster_49:0.023601379741487884 - cluster/prob_snapshot/cluster_50:0.009166535267483877 - cluster/prob_snapshot/cluster_51:0.012844653683839974 - cluster/prob_snapshot/cluster_52:0.0163459625541642 - cluster/prob_snapshot/cluster_53:0.011750390846237722 - cluster/prob_snapshot/cluster_54:0.018654766706559696 - cluster/prob_snapshot/cluster_55:0.01302958391883364 - cluster/prob_snapshot/cluster_56:0.013762106975410358 - cluster/prob_snapshot/cluster_57:0.013762106975410358 - cluster/prob_snapshot/cluster_58:0.01536560898975131 - cluster/prob_snapshot/cluster_59:0.021951403393340785 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.017732330264944574 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.009437840103253029
[36m(TaskRunner pid=2823680)[0m Training Progress:  31%|███       | 247/800 [7:58:41<17:26:13, 113.52s/it]
[36m(TaskRunner pid=2823680)[0m step:247 - global_seqlen/min:334271 - global_seqlen/max:456438 - global_seqlen/minmax_diff:122167 - global_seqlen/balanced_min:396215 - global_seqlen/balanced_max:396503 - global_seqlen/mean:396424.75 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.2030508826505773) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011910970322787762 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.052147489637718536) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005538626651936543) - actor/ppo_kl:np.float64(0.00010941615097518637) - actor/pg_clipfrac_lower:np.float64(2.6186904617542477e-06) - actor/grad_norm:np.float64(0.2580433843227533) - perf/mfu/actor:np.float64(0.2200568408233492) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.6455307006836) - actor/lr:np.float64(1e-06) - training/global_step:247 - training/epoch:0 - critic/score/mean:0.6109693646430969 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6088550090789795 - critic/rewards/max:1.041306734085083 - critic/rewards/min:-0.0644633024930954 - critic/advantages/mean:-0.12992241978645325 - critic/advantages/max:2.474811553955078 - critic/advantages/min:-2.474843740463257 - critic/returns/mean:-0.12992241978645325 - critic/returns/max:2.474811553955078 - critic/returns/min:-2.474843740463257 - response_length/mean:1190.580322265625 - response_length/max:8192.0 - response_length/min:167.0 - response_length/clip_ratio:0.025510204955935478 - response_length_non_aborted/mean:1190.580322265625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:167.0 - response_length_non_aborted/clip_ratio:0.025510204955935478 - response/aborted_ratio:0.0 - prompt_length/mean:234.65306091308594 - prompt_length/max:461.0 - prompt_length/min:185.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.544139564037323e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5243230583146214) - timing_s/agent_loop/generate_sequences/max:np.float64(29.543135316111147) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.9730943775521155) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.543135316111147) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:222 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.135027756914496 - timing_s/reward:0.0002689119428396225 - timing_s/old_log_prob:10.152160876430571 - timing_s/ref:20.516699612140656 - timing_s/adv:0.09248793683946133 - timing_s/update_actor:21.37056590616703 - timing_s/update_weights:27.16220653243363 - timing_s/step:111.94144958257675 - timing_s/stop_profile:6.36465847492218e-05 - timing_per_token_ms/adv:8.277192049589203e-05 - timing_per_token_ms/update_actor:0.019125551316036694 - timing_per_token_ms/gen:0.03442737448714076 - timing_per_token_ms/ref:0.018361385140225558 - perf/total_num_tokens:1585699 - perf/time_per_step:111.94144958257675 - perf/throughput:3541.358017769514 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:721.0 - frontier/mean_score:2.6779327487044253 - frontier/mean_frontier_pct:0.33511204191659827 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.4829929999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:2.5318456999999994 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.6601 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.7259299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.6569999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.60613 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:80.0 - frontier/cluster_7/score:2.9717524750999993 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:3.3623519899999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:2.1382909999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:128.0 - frontier/cluster_10/score:4.8083497789999985 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:3.4319299999999995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:3.5942351989999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.9423519899999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.194876132569999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.1798999999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - 
frontier/cluster_16/score:2.9596463929999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:1.9379299999999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.7598999999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.839577219299999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:3.0864119899999993 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.798291989999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:3.223708372999999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.8400999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:2.09 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.3825509999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.7815089999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.530215586999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:3.012941389999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.3262142999999993 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.1815089999999993 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.7382909999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:3.3204109999999996 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.7170036999999998 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3176456999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.4264109999999994 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:3.2363519899999993 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:2.4820699999999993 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:3.3023509999999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.4023500099999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:2.6878699999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:3.8662883929999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.9176456999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.339899999999999 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:3.8363519899999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.0878699999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:2.6569999999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.0322909999999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:2.3825509999999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.237 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.237 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:2.6483519899999997 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:3.568151989999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:2.8823509999999994 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.5340999999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:247.0 - cluster/prob_snapshot/cluster_0:0.02132176244129479 - cluster/prob_snapshot/cluster_1:0.015499144716459008 - 
cluster/prob_snapshot/cluster_2:0.010162598038179657 - cluster/prob_snapshot/cluster_3:0.02280894458670847 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01626529907080498 - cluster/prob_snapshot/cluster_6:0.00983221106382838 - cluster/prob_snapshot/cluster_7:0.018192112447085598 - cluster/prob_snapshot/cluster_8:0.020583236996110758 - cluster/prob_snapshot/cluster_9:0.013089929475126329 - cluster/prob_snapshot/cluster_10:0.029435170189113295 - cluster/prob_snapshot/cluster_11:0.021009171185573106 - cluster/prob_snapshot/cluster_12:0.022002751389743794 - cluster/prob_snapshot/cluster_13:0.01801213213734595 - cluster/prob_snapshot/cluster_14:0.019558003684767716 - cluster/prob_snapshot/cluster_15:0.019466324619967165 - cluster/prob_snapshot/cluster_16:0.018118002907781035 - cluster/prob_snapshot/cluster_17:0.011863383902252576 - cluster/prob_snapshot/cluster_18:0.016895219761202358 - cluster/prob_snapshot/cluster_19:0.0153654123702373 - cluster/prob_snapshot/cluster_20:0.01738297806042878 - cluster/prob_snapshot/cluster_21:0.018894021103902276 - cluster/prob_snapshot/cluster_22:0.007591493179414851 - cluster/prob_snapshot/cluster_23:0.01713024317078962 - cluster/prob_snapshot/cluster_24:0.019734505383478786 - cluster/prob_snapshot/cluster_25:0.011264500120507432 - cluster/prob_snapshot/cluster_26:0.012794307511472492 - cluster/prob_snapshot/cluster_27:0.014585210600845116 - cluster/prob_snapshot/cluster_28:0.017027503106185806 - cluster/prob_snapshot/cluster_29:0.009121300570379912 - cluster/prob_snapshot/cluster_30:0.015489165689186063 - cluster/prob_snapshot/cluster_31:0.018444257730958546 - cluster/prob_snapshot/cluster_32:0.014240335450614698 - cluster/prob_snapshot/cluster_33:0.013354496165093224 - cluster/prob_snapshot/cluster_34:0.01676293641621891 - cluster/prob_snapshot/cluster_35:0.020326487750466932 - cluster/prob_snapshot/cluster_36:0.010510944179969407 - cluster/prob_snapshot/cluster_37:0.014187881238488957 - 
cluster/prob_snapshot/cluster_38:0.014853707408238985 - cluster/prob_snapshot/cluster_39:0.01786088817958154 - cluster/prob_snapshot/cluster_40:0.01775286688194748 - cluster/prob_snapshot/cluster_41:0.019811905538481313 - cluster/prob_snapshot/cluster_42:0.015194433897129437 - cluster/prob_snapshot/cluster_43:0.020215930241540045 - cluster/prob_snapshot/cluster_44:0.014706413769439718 - cluster/prob_snapshot/cluster_45:0.01645427527792419 - cluster/prob_snapshot/cluster_46:0.023668173506257804 - cluster/prob_snapshot/cluster_47:0.01786088817958154 - cluster/prob_snapshot/cluster_48:0.014324114902437549 - cluster/prob_snapshot/cluster_49:0.023484912479573895 - cluster/prob_snapshot/cluster_50:0.009121300570379912 - cluster/prob_snapshot/cluster_51:0.012781268336831613 - cluster/prob_snapshot/cluster_52:0.01626529907080498 - cluster/prob_snapshot/cluster_53:0.01169240542914472 - cluster/prob_snapshot/cluster_54:0.018562709817354275 - cluster/prob_snapshot/cluster_55:0.014585210600845116 - cluster/prob_snapshot/cluster_56:0.013694194212040177 - cluster/prob_snapshot/cluster_57:0.013694194212040177 - cluster/prob_snapshot/cluster_58:0.016212358736210585 - cluster/prob_snapshot/cluster_59:0.02184307837690551 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.017644825382775238 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.009391266580550216
[36m(RewardLoopWorker pid=2826755)[0m WARNING:2026-04-12 19:31:11,820:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  31%|███       | 248/800 [8:00:41<17:42:06, 115.45s/it]
[36m(TaskRunner pid=2823680)[0m step:248 - global_seqlen/min:321257 - global_seqlen/max:444543 - global_seqlen/minmax_diff:123286 - global_seqlen/balanced_min:394966 - global_seqlen/balanced_max:395267 - global_seqlen/mean:395087.0 - frontier/skipped_zero_acc_count:23.0 - actor/entropy:np.float64(0.21249475744816493) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012205841019749641 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.03768486329136067) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0007059032875574458) - actor/ppo_kl:np.float64(0.0011823330034449442) - actor/pg_clipfrac_lower:np.float64(3.067036149726812e-05) - actor/grad_norm:np.float64(0.2540034513388361) - perf/mfu/actor:np.float64(0.16249091023087736) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.7446060180664) - actor/lr:np.float64(1e-06) - training/global_step:248 - training/epoch:0 - critic/score/mean:0.6023809313774109 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5957428812980652 - critic/rewards/max:1.1479121446609497 - critic/rewards/min:-0.07146394997835159 - critic/advantages/mean:-0.13861672580242157 - critic/advantages/max:2.4747560024261475 - critic/advantages/min:-2.474839925765991 - critic/returns/mean:-0.13861672580242157 - critic/returns/max:2.4747560024261475 - critic/returns/min:-2.474839925765991 - response_length/mean:1242.55712890625 - response_length/max:8192.0 - response_length/min:190.0 - response_length/clip_ratio:0.02261904813349247 - response_length_non_aborted/mean:1242.55712890625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:190.0 - response_length_non_aborted/clip_ratio:0.02261904813349247 - response/aborted_ratio:0.0 - prompt_length/mean:240.8000030517578 - prompt_length/max:411.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.289942681789398e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4460564889013767) - timing_s/agent_loop/generate_sequences/max:np.float64(30.35580259282142) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.992453084912086) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.35580259282142) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:202 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:33.2388666421175 - timing_s/reward:0.00018560420721769333 - timing_s/old_log_prob:11.869713941588998 - timing_s/ref:13.915414276532829 - timing_s/adv:0.11416768003255129 - timing_s/update_actor:28.57586675044149 - timing_s/update_weights:31.036621666513383 - timing_s/step:119.1911575468257 - timing_s/stop_profile:7.941760122776031e-05 - timing_per_token_ms/adv:9.162588083060568e-05 - timing_per_token_ms/update_actor:0.02293371434683351 - timing_per_token_ms/gen:0.03184568175662852 - timing_per_token_ms/ref:0.011167889982931918 - perf/total_num_tokens:1580348 - perf/time_per_step:119.1911575468257 - perf/throughput:3314.734147495675 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:744.0 - frontier/mean_score:2.6826757366568676 - frontier/mean_frontier_pct:0.34762490359568493 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.4829929999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:2.5318456999999994 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.6601 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.7259299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.6569999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:1.60613 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:80.0 - frontier/cluster_7/score:2.9802267325699994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:3.2536463929999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:2.1382909999999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:128.0 - frontier/cluster_10/score:4.265844845299998 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:3.4319299999999995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:3.5942351989999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.9423519899999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.136413292798999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.1798999999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.9596463929999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:1.9379299999999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:2.7598999999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.839577219299999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.460488392999999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.798291989999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:3.223708372999999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.8400999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:2.09 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.3825509999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.7815089999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.530215586999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:3.012941389999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.3262142999999993 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - 
frontier/cluster_33/score:2.1815089999999993 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.8168036999999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:3.8242876999999997 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.7170036999999998 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3176456999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.5984876999999997 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:3.1654463929999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:2.4820699999999993 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:3.3023509999999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.4023500099999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:2.6878699999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:3.6064018750999995 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:3.54235199 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:2.5379299999999994 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - 
frontier/cluster_49/score:3.8363519899999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.0878699999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.7598999999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.0322909999999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:2.5677856999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4659 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.237 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:2.6483519899999997 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:3.568151989999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:2.8823509999999994 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.5340999999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:248.0 - cluster/prob_snapshot/cluster_0:0.021284065428196247 - cluster/prob_snapshot/cluster_1:0.015471742128938336 - cluster/prob_snapshot/cluster_2:0.01014463049950103 - cluster/prob_snapshot/cluster_3:0.022768618226014017 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01623654191745933 - cluster/prob_snapshot/cluster_6:0.009814827651444848 - cluster/prob_snapshot/cluster_7:0.01821173363451474 - cluster/prob_snapshot/cluster_8:0.01988256155232776 - cluster/prob_snapshot/cluster_9:0.013066786395643971 - cluster/prob_snapshot/cluster_10:0.026067959595066335 - cluster/prob_snapshot/cluster_11:0.02097202683582469 - cluster/prob_snapshot/cluster_12:0.021963850383805526 - cluster/prob_snapshot/cluster_13:0.01798028657190624 - cluster/prob_snapshot/cluster_14:0.019166167067748437 - cluster/prob_snapshot/cluster_15:0.01943190803286749 - cluster/prob_snapshot/cluster_16:0.01808597016213843 - cluster/prob_snapshot/cluster_17:0.011842409363229943 - cluster/prob_snapshot/cluster_18:0.016865348904025596 - cluster/prob_snapshot/cluster_19:0.015338246222364668 - cluster/prob_snapshot/cluster_20:0.01735224484344262 - cluster/prob_snapshot/cluster_21:0.015035687967770658 - cluster/prob_snapshot/cluster_22:0.007578071370659134 - cluster/prob_snapshot/cluster_23:0.017099956790713466 - cluster/prob_snapshot/cluster_24:0.019699614651064776 - cluster/prob_snapshot/cluster_25:0.011244584411861843 - cluster/prob_snapshot/cluster_26:0.012771687093522771 - cluster/prob_snapshot/cluster_27:0.014559423854717592 - cluster/prob_snapshot/cluster_28:0.01699739837120451 - cluster/prob_snapshot/cluster_29:0.009105174052320062 - cluster/prob_snapshot/cluster_30:0.015461780744649775 - cluster/prob_snapshot/cluster_31:0.018411648164690694 - cluster/prob_snapshot/cluster_32:0.014215158445970384 - cluster/prob_snapshot/cluster_33:0.013330885330001799 - cluster/prob_snapshot/cluster_34:0.017213079167596738 - cluster/prob_snapshot/cluster_35:0.02336966787560186 - cluster/prob_snapshot/cluster_36:0.010492360763072174 - cluster/prob_snapshot/cluster_37:0.014162796973228968 - cluster/prob_snapshot/cluster_38:0.015878981732424722 - cluster/prob_snapshot/cluster_39:0.01782931001443168 - 
cluster/prob_snapshot/cluster_40:0.01772147969914643 - cluster/prob_snapshot/cluster_41:0.019343584135270958 - cluster/prob_snapshot/cluster_42:0.01516757004029668 - cluster/prob_snapshot/cluster_43:0.020180188346881345 - cluster/prob_snapshot/cluster_44:0.014680412735330762 - cluster/prob_snapshot/cluster_45:0.016425184013429208 - cluster/prob_snapshot/cluster_46:0.02203819917812009 - cluster/prob_snapshot/cluster_47:0.02164679961310895 - cluster/prob_snapshot/cluster_48:0.015508922404432652 - cluster/prob_snapshot/cluster_49:0.023443391003298276 - cluster/prob_snapshot/cluster_50:0.0082068783572254 - cluster/prob_snapshot/cluster_51:0.0127586709722265 - cluster/prob_snapshot/cluster_52:0.016865348904025596 - cluster/prob_snapshot/cluster_53:0.011671733181161959 - cluster/prob_snapshot/cluster_54:0.018529890827036006 - cluster/prob_snapshot/cluster_55:0.01569136626010638 - cluster/prob_snapshot/cluster_56:0.01506875751383627 - cluster/prob_snapshot/cluster_57:0.013669982788617438 - cluster/prob_snapshot/cluster_58:0.01618369518171691 - cluster/prob_snapshot/cluster_59:0.02180445967388066 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.01761362921803945 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.009374662760848461
[36m(TaskRunner pid=2823680)[0m step:249 - global_seqlen/min:267449 - global_seqlen/max:450696 - global_seqlen/minmax_diff:183247 - global_seqlen/balanced_min:371959 - global_seqlen/balanced_max:372113 - global_seqlen/mean:372061.0 - frontier/skipped_zero_acc_count:23.0 - actor/entropy:np.float64(0.18700445752661182) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01202408503741026 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.008441462850896642) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005186318594496697) - actor/ppo_kl:np.float64(8.911807144785306e-06) - actor/pg_clipfrac_lower:np.float64(4.616942038959472e-05) - actor/grad_norm:np.float64(0.27281427809170317) - perf/mfu/actor:np.float64(0.1584326352907956) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.11502075195312) - actor/lr:np.float64(1e-06) - training/global_step:249 - training/epoch:0 - critic/score/mean:0.6285714507102966 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.623491108417511 - critic/rewards/max:1.1593835353851318 - critic/rewards/min:-0.1358780711889267 - critic/advantages/mean:-0.08907438814640045 - critic/advantages/max:2.4748473167419434 - critic/advantages/min:-2.4748520851135254 - critic/returns/mean:-0.08907438814640045 - critic/returns/max:2.4748473167419434 - critic/returns/min:-2.4748520851135254 - response_length/mean:1099.228515625 - response_length/max:8192.0 - response_length/min:183.0 - response_length/clip_ratio:0.01666666753590107 - response_length_non_aborted/mean:1099.228515625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:183.0 - response_length_non_aborted/clip_ratio:0.01666666753590107 - response/aborted_ratio:0.0 - prompt_length/mean:229.4761962890625 - prompt_length/max:641.0 - prompt_length/min:178.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.276291191577911e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.435847613029182) - timing_s/agent_loop/generate_sequences/max:np.float64(28.460938051342964) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.453443094897011) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.460938051342964) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:226 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.567124543711543 - timing_s/reward:0.0002626432105898857 - timing_s/old_log_prob:12.452395646832883 - timing_s/ref:13.366811932064593 - timing_s/adv:0.0924470704048872 - timing_s/update_actor:27.61566391494125 - timing_s/update_weights:28.235796851105988 - timing_s/step:112.76123834401369 - timing_s/stop_profile:6.982125341892242e-05 - timing_per_token_ms/adv:8.282956406246613e-05 - timing_per_token_ms/update_actor:0.024742735419869377 - timing_per_token_ms/gen:0.03310451977546108 - timing_per_token_ms/ref:0.011976228131284846 - perf/total_num_tokens:1488244 - perf/time_per_step:112.76123834401369 - perf/throughput:3299.5469495014827 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:767.0 - frontier/mean_score:2.6879204380634096 - frontier/mean_frontier_pct:0.36541371975673 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.3380950999999994 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.072291989999999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.6601 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:4.108150999999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.6569999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:2.024291 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:2.9861587127989995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:3.2536463929999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:2.3968036999999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:128.0 - frontier/cluster_10/score:4.265844845299998 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:3.4319299999999995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:3.5942351989999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.9423519899999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.136413292798999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.1798999999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.9596463929999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:1.9379299999999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:3.4319299999999995 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.839577219299999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.460488392999999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.798291989999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:3.223708372999999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.8400999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:2.09 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.3825509999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.7815089999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.530215586999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:3.012941389999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.3262142999999993 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - 
frontier/cluster_33/score:2.1815089999999993 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.8168036999999995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:3.5770013899999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.7170036999999998 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3176456999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.5984876999999997 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.53 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.1158124750999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:2.037448999999999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.4023500099999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:2.6878699999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:3.4244813125699993 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:3.379646393 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:2.5379299999999994 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - 
frontier/cluster_49/score:3.8363519899999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.0878699999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.7598999999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.0322909999999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:2.5677856999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4659 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.4659 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.1538463929999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:3.568151989999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:2.8823509999999994 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.5340999999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:249.0 - cluster/prob_snapshot/cluster_0:0.02035881334762645 - cluster/prob_snapshot/cluster_1:0.012638766890191763 - cluster/prob_snapshot/cluster_2:0.010124836179291198 - cluster/prob_snapshot/cluster_3:0.02505533153110735 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01620486098932396 - cluster/prob_snapshot/cluster_6:0.012346012140361157 - cluster/prob_snapshot/cluster_7:0.018212377430548127 - cluster/prob_snapshot/cluster_8:0.01984376646856617 - cluster/prob_snapshot/cluster_9:0.014617941579675322 - cluster/prob_snapshot/cluster_10:0.02601709549119709 - cluster/prob_snapshot/cluster_11:0.020931105974817683 - cluster/prob_snapshot/cluster_12:0.021920994265235284 - cluster/prob_snapshot/cluster_13:0.017945203229059365 - cluster/prob_snapshot/cluster_14:0.019128769821180135 - cluster/prob_snapshot/cluster_15:0.019393992269458513 - cluster/prob_snapshot/cluster_16:0.018050680608249558 - cluster/prob_snapshot/cluster_17:0.011819302317290396 - cluster/prob_snapshot/cluster_18:0.020931105974817683 - cluster/prob_snapshot/cluster_19:0.015308318059165656 - cluster/prob_snapshot/cluster_20:0.017318386942870745 - cluster/prob_snapshot/cluster_21:0.015006350159732817 - cluster/prob_snapshot/cluster_22:0.007563284950267463 - cluster/prob_snapshot/cluster_23:0.01706659115750422 - cluster/prob_snapshot/cluster_24:0.01966117653541013 - cluster/prob_snapshot/cluster_25:0.011222643848872798 - cluster/prob_snapshot/cluster_26:0.01274676683014192 - cluster/prob_snapshot/cluster_27:0.014531015338718402 - cluster/prob_snapshot/cluster_28:0.01696423285116805 - cluster/prob_snapshot/cluster_29:0.009087407931536584 - cluster/prob_snapshot/cluster_30:0.01543161153946395 - cluster/prob_snapshot/cluster_31:0.018375723144121378 - cluster/prob_snapshot/cluster_32:0.014187421664613302 - cluster/prob_snapshot/cluster_33:0.01330487395256271 - cluster/prob_snapshot/cluster_34:0.017179492808699057 - cluster/prob_snapshot/cluster_35:0.02181588644470026 - cluster/prob_snapshot/cluster_36:0.010471887947555478 - cluster/prob_snapshot/cluster_37:0.01413516236018232 - cluster/prob_snapshot/cluster_38:0.01584799847985252 - cluster/prob_snapshot/cluster_39:0.017794521258787655 - 
cluster/prob_snapshot/cluster_40:0.021529228186794726 - cluster/prob_snapshot/cluster_41:0.019003126845237833 - cluster/prob_snapshot/cluster_42:0.012426261881007568 - cluster/prob_snapshot/cluster_43:0.019587607119104268 - cluster/prob_snapshot/cluster_44:0.014651768144430194 - cluster/prob_snapshot/cluster_45:0.016393135004657205 - cluster/prob_snapshot/cluster_46:0.020885676940434514 - cluster/prob_snapshot/cluster_47:0.020612231837273294 - cluster/prob_snapshot/cluster_48:0.015478661215895731 - cluster/prob_snapshot/cluster_49:0.023397647987981312 - cluster/prob_snapshot/cluster_50:0.008190865001378278 - cluster/prob_snapshot/cluster_51:0.012733776106051871 - cluster/prob_snapshot/cluster_52:0.016832441040434777 - cluster/prob_snapshot/cluster_53:0.01164895916056032 - cluster/prob_snapshot/cluster_54:0.01849373509001812 - cluster/prob_snapshot/cluster_55:0.015660749085010885 - cluster/prob_snapshot/cluster_56:0.015039355180118164 - cluster/prob_snapshot/cluster_57:0.015039355180118164 - cluster/prob_snapshot/cluster_58:0.01313616160742259 - cluster/prob_snapshot/cluster_59:0.021761914560304726 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.017579261301256647 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.009356370810584077
[36m(TaskRunner pid=2823680)[0m Training Progress:  31%|███       | 249/800 [8:02:34<17:33:23, 114.71s/it]
[36m(TaskRunner pid=2823680)[0m 
[36m(TaskRunner pid=2823680)[0m local_global_step_folder: /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/checkpoints/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633/global_step_250
[36m(WorkerDict pid=2825160)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.
[36m(WorkerDict pid=2825160)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")
[36m(TaskRunner pid=2823680)[0m test_gen_batch meta info: {'eos_token_id': 151643, 'pad_token_id': 151643, 'recompute_log_prob': False, 'do_sample': True, 'validate': True, 'global_steps': 250}
[36m(RewardLoopWorker pid=2826762)[0m WARNING:2026-04-12 19:38:28,706:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(WorkerDict pid=2825159)[0m /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.[32m [repeated 3x across cluster][0m
[36m(WorkerDict pid=2825159)[0m   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")[32m [repeated 3x across cluster][0m
[36m(TaskRunner pid=2823680)[0m validation generation end
[36m(TaskRunner pid=2823680)[0m Training Progress:  31%|███▏      | 250/800 [8:07:57<27:04:46, 177.25s/it]
[36m(TaskRunner pid=2823680)[0m step:250 - global_seqlen/min:361107 - global_seqlen/max:415199 - global_seqlen/minmax_diff:54092 - global_seqlen/balanced_min:380252 - global_seqlen/balanced_max:380471 - global_seqlen/mean:380367.0 - frontier/skipped_zero_acc_count:22.0 - actor/entropy:np.float64(0.20003272828487856) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011515079997479916 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.03463857449241914) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004966581114134004) - actor/ppo_kl:np.float64(-0.00022411316689117657) - actor/pg_clipfrac_lower:np.float64(2.0950277483382894e-05) - actor/grad_norm:np.float64(0.25643295688288553) - perf/mfu/actor:np.float64(0.13703755420587385) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.59751892089844) - actor/lr:np.float64(1e-06) - val-aux/aime2024/reward/mean@16:np.float64(0.10208333333333333) - val-aux/aime2024/reward/std@16:np.float64(0.139972746481533) - val-aux/aime2024/reward/best@2/mean:np.float64(0.16066666666666668) - val-aux/aime2024/reward/best@2/std:np.float64(0.14623264685461873) - val-aux/aime2024/reward/worst@2/mean:np.float64(0.04513333333333334) - val-aux/aime2024/reward/worst@2/std:np.float64(0.08927129119872775) - val-aux/aime2024/reward/maj@2/mean:np.float64(0.10373333333333333) - val-aux/aime2024/reward/maj@2/std:np.float64(0.1405414429278894) - val-aux/aime2024/reward/best@4/mean:np.float64(0.2195) - val-aux/aime2024/reward/best@4/std:np.float64(0.13361467779625819) - val-aux/aime2024/reward/worst@4/mean:np.float64(0.013166666666666667) - val-aux/aime2024/reward/worst@4/std:np.float64(0.04297093173966439) - val-aux/aime2024/reward/maj@4/mean:np.float64(0.12556666666666666) - val-aux/aime2024/reward/maj@4/std:np.float64(0.13051752787133225) - val-aux/aime2024/reward/best@8/mean:np.float64(0.2748666666666667) - 
val-aux/aime2024/reward/best@8/std:np.float64(0.11014311461181706) - val-aux/aime2024/reward/worst@8/mean:np.float64(0.0013) - val-aux/aime2024/reward/worst@8/std:np.float64(0.01297479482193106) - val-aux/aime2024/reward/maj@8/mean:np.float64(0.14486666666666667) - val-aux/aime2024/reward/maj@8/std:np.float64(0.11157074550804093) - val-aux/aime2024/reward/best@16/mean:np.float64(0.3219333333333333) - val-aux/aime2024/reward/best@16/std:np.float64(0.07832786080034504) - val-aux/aime2024/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2024/reward/worst@16/std:np.float64(0.0) - val-aux/aime2024/reward/maj@16/mean:np.float64(0.16336666666666666) - val-aux/aime2024/reward/maj@16/std:np.float64(0.08591304131278933) - val-aux/aime2024/score/mean@16:np.float64(0.10208333333333333) - val-aux/aime2024/score/std@16:np.float64(0.139972746481533) - val-aux/aime2024/score/best@2/mean:np.float64(0.16066666666666668) - val-aux/aime2024/score/best@2/std:np.float64(0.14623264685461873) - val-aux/aime2024/score/worst@2/mean:np.float64(0.04513333333333334) - val-aux/aime2024/score/worst@2/std:np.float64(0.08927129119872775) - val-aux/aime2024/score/maj@2/mean:np.float64(0.10373333333333333) - val-aux/aime2024/score/maj@2/std:np.float64(0.1405414429278894) - val-aux/aime2024/score/best@4/mean:np.float64(0.2195) - val-aux/aime2024/score/best@4/std:np.float64(0.13361467779625819) - val-aux/aime2024/score/worst@4/mean:np.float64(0.013166666666666667) - val-aux/aime2024/score/worst@4/std:np.float64(0.04297093173966439) - val-aux/aime2024/score/maj@4/mean:np.float64(0.12556666666666666) - val-aux/aime2024/score/maj@4/std:np.float64(0.13051752787133225) - val-aux/aime2024/score/best@8/mean:np.float64(0.2748666666666667) - val-aux/aime2024/score/best@8/std:np.float64(0.11014311461181706) - val-aux/aime2024/score/worst@8/mean:np.float64(0.0013) - val-aux/aime2024/score/worst@8/std:np.float64(0.01297479482193106) - val-aux/aime2024/score/maj@8/mean:np.float64(0.14486666666666667) - 
val-aux/aime2024/score/maj@8/std:np.float64(0.11157074550804093) - val-aux/aime2024/score/best@16/mean:np.float64(0.3219333333333333) - val-aux/aime2024/score/best@16/std:np.float64(0.07832786080034504) - val-aux/aime2024/score/worst@16/mean:np.float64(0.0) - val-aux/aime2024/score/worst@16/std:np.float64(0.0) - val-aux/aime2024/score/maj@16/mean:np.float64(0.16336666666666666) - val-aux/aime2024/score/maj@16/std:np.float64(0.08591304131278933) - val-core/aime2024/acc/mean@16:np.float64(0.10208333333333333) - val-aux/aime2024/acc/std@16:np.float64(0.139972746481533) - val-aux/aime2024/acc/best@2/mean:np.float64(0.16066666666666668) - val-aux/aime2024/acc/best@2/std:np.float64(0.14623264685461873) - val-aux/aime2024/acc/worst@2/mean:np.float64(0.04513333333333334) - val-aux/aime2024/acc/worst@2/std:np.float64(0.08927129119872775) - val-aux/aime2024/acc/maj@2/mean:np.float64(0.10373333333333333) - val-aux/aime2024/acc/maj@2/std:np.float64(0.1405414429278894) - val-aux/aime2024/acc/best@4/mean:np.float64(0.2195) - val-aux/aime2024/acc/best@4/std:np.float64(0.13361467779625819) - val-aux/aime2024/acc/worst@4/mean:np.float64(0.013166666666666667) - val-aux/aime2024/acc/worst@4/std:np.float64(0.04297093173966439) - val-aux/aime2024/acc/maj@4/mean:np.float64(0.12556666666666666) - val-aux/aime2024/acc/maj@4/std:np.float64(0.13051752787133225) - val-aux/aime2024/acc/best@8/mean:np.float64(0.2748666666666667) - val-aux/aime2024/acc/best@8/std:np.float64(0.11014311461181706) - val-aux/aime2024/acc/worst@8/mean:np.float64(0.0013) - val-aux/aime2024/acc/worst@8/std:np.float64(0.01297479482193106) - val-aux/aime2024/acc/maj@8/mean:np.float64(0.14486666666666667) - val-aux/aime2024/acc/maj@8/std:np.float64(0.11157074550804093) - val-core/aime2024/acc/best@16/mean:np.float64(0.3219333333333333) - val-core/aime2024/acc/best@16/std:np.float64(0.07832786080034504) - val-aux/aime2024/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2024/acc/worst@16/std:np.float64(0.0) - 
val-core/aime2024/acc/maj@16/mean:np.float64(0.16336666666666666) - val-core/aime2024/acc/maj@16/std:np.float64(0.08591304131278933) - val-aux/aime2025/reward/mean@16:np.float64(0.04375) - val-aux/aime2025/reward/std@16:np.float64(0.09659204961319766) - val-aux/aime2025/reward/best@2/mean:np.float64(0.07529999999999999) - val-aux/aime2025/reward/best@2/std:np.float64(0.1171224617720761) - val-aux/aime2025/reward/worst@2/mean:np.float64(0.008266666666666667) - val-aux/aime2025/reward/worst@2/std:np.float64(0.04365649112143859) - val-aux/aime2025/reward/maj@2/mean:np.float64(0.04136666666666667) - val-aux/aime2025/reward/maj@2/std:np.float64(0.09414339196593563) - val-aux/aime2025/reward/best@4/mean:np.float64(0.12856666666666666) - val-aux/aime2025/reward/best@4/std:np.float64(0.12853713935335787) - val-aux/aime2025/reward/worst@4/mean:np.float64(0.00023333333333333333) - val-aux/aime2025/reward/worst@4/std:np.float64(0.005264660637597089) - val-aux/aime2025/reward/maj@4/mean:np.float64(0.04573333333333333) - val-aux/aime2025/reward/maj@4/std:np.float64(0.09718365248197024) - val-aux/aime2025/reward/best@8/mean:np.float64(0.19146666666666665) - val-aux/aime2025/reward/best@8/std:np.float64(0.11254685820672333) - val-aux/aime2025/reward/worst@8/mean:np.float64(0.0) - val-aux/aime2025/reward/worst@8/std:np.float64(0.0) - val-aux/aime2025/reward/maj@8/mean:np.float64(0.05589999999999999) - val-aux/aime2025/reward/maj@8/std:np.float64(0.10062935939354865) - val-aux/aime2025/reward/best@16/mean:np.float64(0.241) - val-aux/aime2025/reward/best@16/std:np.float64(0.06618840614177873) - val-aux/aime2025/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2025/reward/worst@16/std:np.float64(0.0) - val-aux/aime2025/reward/maj@16/mean:np.float64(0.06333333333333332) - val-aux/aime2025/reward/maj@16/std:np.float64(0.09640384001397236) - val-aux/aime2025/score/mean@16:np.float64(0.04375) - val-aux/aime2025/score/std@16:np.float64(0.09659204961319766) - 
val-aux/aime2025/score/best@2/mean:np.float64(0.07529999999999999) - val-aux/aime2025/score/best@2/std:np.float64(0.1171224617720761) - val-aux/aime2025/score/worst@2/mean:np.float64(0.008266666666666667) - val-aux/aime2025/score/worst@2/std:np.float64(0.04365649112143859) - val-aux/aime2025/score/maj@2/mean:np.float64(0.04136666666666667) - val-aux/aime2025/score/maj@2/std:np.float64(0.09414339196593563) - val-aux/aime2025/score/best@4/mean:np.float64(0.12856666666666666) - val-aux/aime2025/score/best@4/std:np.float64(0.12853713935335787) - val-aux/aime2025/score/worst@4/mean:np.float64(0.00023333333333333333) - val-aux/aime2025/score/worst@4/std:np.float64(0.005264660637597089) - val-aux/aime2025/score/maj@4/mean:np.float64(0.04573333333333333) - val-aux/aime2025/score/maj@4/std:np.float64(0.09718365248197024) - val-aux/aime2025/score/best@8/mean:np.float64(0.19146666666666665) - val-aux/aime2025/score/best@8/std:np.float64(0.11254685820672333) - val-aux/aime2025/score/worst@8/mean:np.float64(0.0) - val-aux/aime2025/score/worst@8/std:np.float64(0.0) - val-aux/aime2025/score/maj@8/mean:np.float64(0.05589999999999999) - val-aux/aime2025/score/maj@8/std:np.float64(0.10062935939354865) - val-aux/aime2025/score/best@16/mean:np.float64(0.241) - val-aux/aime2025/score/best@16/std:np.float64(0.06618840614177873) - val-aux/aime2025/score/worst@16/mean:np.float64(0.0) - val-aux/aime2025/score/worst@16/std:np.float64(0.0) - val-aux/aime2025/score/maj@16/mean:np.float64(0.06333333333333332) - val-aux/aime2025/score/maj@16/std:np.float64(0.09640384001397236) - val-core/aime2025/acc/mean@16:np.float64(0.04375) - val-aux/aime2025/acc/std@16:np.float64(0.09659204961319766) - val-aux/aime2025/acc/best@2/mean:np.float64(0.07529999999999999) - val-aux/aime2025/acc/best@2/std:np.float64(0.1171224617720761) - val-aux/aime2025/acc/worst@2/mean:np.float64(0.008266666666666667) - val-aux/aime2025/acc/worst@2/std:np.float64(0.04365649112143859) - 
val-aux/aime2025/acc/maj@2/mean:np.float64(0.04136666666666667) - val-aux/aime2025/acc/maj@2/std:np.float64(0.09414339196593563) - val-aux/aime2025/acc/best@4/mean:np.float64(0.12856666666666666) - val-aux/aime2025/acc/best@4/std:np.float64(0.12853713935335787) - val-aux/aime2025/acc/worst@4/mean:np.float64(0.00023333333333333333) - val-aux/aime2025/acc/worst@4/std:np.float64(0.005264660637597089) - val-aux/aime2025/acc/maj@4/mean:np.float64(0.04573333333333333) - val-aux/aime2025/acc/maj@4/std:np.float64(0.09718365248197024) - val-aux/aime2025/acc/best@8/mean:np.float64(0.19146666666666665) - val-aux/aime2025/acc/best@8/std:np.float64(0.11254685820672333) - val-aux/aime2025/acc/worst@8/mean:np.float64(0.0) - val-aux/aime2025/acc/worst@8/std:np.float64(0.0) - val-aux/aime2025/acc/maj@8/mean:np.float64(0.05589999999999999) - val-aux/aime2025/acc/maj@8/std:np.float64(0.10062935939354865) - val-core/aime2025/acc/best@16/mean:np.float64(0.241) - val-core/aime2025/acc/best@16/std:np.float64(0.06618840614177873) - val-aux/aime2025/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2025/acc/worst@16/std:np.float64(0.0) - val-core/aime2025/acc/maj@16/mean:np.float64(0.06333333333333332) - val-core/aime2025/acc/maj@16/std:np.float64(0.09640384001397236) - val-aux/math500/reward/mean@4:np.float64(0.6915) - val-aux/math500/reward/std@4:np.float64(0.13500446416709053) - val-aux/math500/reward/best@2/mean:np.float64(0.7533540000000001) - val-aux/math500/reward/best@2/std:np.float64(0.10891562107524837) - val-aux/math500/reward/worst@2/mean:np.float64(0.630068) - val-aux/math500/reward/worst@2/std:np.float64(0.12285308282054336) - val-aux/math500/reward/maj@2/mean:np.float64(0.69179) - val-aux/math500/reward/maj@2/std:np.float64(0.1350890093692068) - val-aux/math500/reward/best@4/mean:np.float64(0.7963520000000001) - val-aux/math500/reward/best@4/std:np.float64(0.06480904349541625) - val-aux/math500/reward/worst@4/mean:np.float64(0.57695) - 
val-aux/math500/reward/worst@4/std:np.float64(0.08704381419247044) - val-aux/math500/reward/maj@4/mean:np.float64(0.7081320000000001) - val-aux/math500/reward/maj@4/std:np.float64(0.12348129363042079) - val-aux/math500/score/mean@4:np.float64(0.6915) - val-aux/math500/score/std@4:np.float64(0.13500446416709053) - val-aux/math500/score/best@2/mean:np.float64(0.7533540000000001) - val-aux/math500/score/best@2/std:np.float64(0.10891562107524837) - val-aux/math500/score/worst@2/mean:np.float64(0.630068) - val-aux/math500/score/worst@2/std:np.float64(0.12285308282054336) - val-aux/math500/score/maj@2/mean:np.float64(0.69179) - val-aux/math500/score/maj@2/std:np.float64(0.1350890093692068) - val-aux/math500/score/best@4/mean:np.float64(0.7963520000000001) - val-aux/math500/score/best@4/std:np.float64(0.06480904349541625) - val-aux/math500/score/worst@4/mean:np.float64(0.57695) - val-aux/math500/score/worst@4/std:np.float64(0.08704381419247044) - val-aux/math500/score/maj@4/mean:np.float64(0.7081320000000001) - val-aux/math500/score/maj@4/std:np.float64(0.12348129363042079) - val-core/math500/acc/mean@4:np.float64(0.6915) - val-aux/math500/acc/std@4:np.float64(0.13500446416709053) - val-aux/math500/acc/best@2/mean:np.float64(0.7533540000000001) - val-aux/math500/acc/best@2/std:np.float64(0.10891562107524837) - val-aux/math500/acc/worst@2/mean:np.float64(0.630068) - val-aux/math500/acc/worst@2/std:np.float64(0.12285308282054336) - val-aux/math500/acc/maj@2/mean:np.float64(0.69179) - val-aux/math500/acc/maj@2/std:np.float64(0.1350890093692068) - val-core/math500/acc/best@4/mean:np.float64(0.7963520000000001) - val-core/math500/acc/best@4/std:np.float64(0.06480904349541625) - val-aux/math500/acc/worst@4/mean:np.float64(0.57695) - val-aux/math500/acc/worst@4/std:np.float64(0.08704381419247044) - val-core/math500/acc/maj@4/mean:np.float64(0.7081320000000001) - val-core/math500/acc/maj@4/std:np.float64(0.12348129363042079) - val-aux/num_turns/min:np.int32(2) - 
val-aux/num_turns/max:np.int32(2) - val-aux/num_turns/mean:np.float64(2.0) - val-aux/response_length/clip_ratio:0.07601351351351351 - val-aux/aime2024/response_length/clip_ratio:0.18958333333333333 - val-aux/aime2025/response_length/clip_ratio:0.1125 - val-aux/math500/response_length/clip_ratio:0.04 - training/global_step:250 - training/epoch:0 - critic/score/mean:0.6214622855186462 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6177131533622742 - critic/rewards/max:1.0692999362945557 - critic/rewards/min:-0.11017207056283951 - critic/advantages/mean:-0.08407039940357208 - critic/advantages/max:2.4748306274414062 - critic/advantages/min:-2.47485613822937 - critic/returns/mean:-0.08407039940357208 - critic/returns/max:2.4748306274414062 - critic/returns/min:-2.47485613822937 - response_length/mean:1199.85498046875 - response_length/max:8192.0 - response_length/min:60.0 - response_length/clip_ratio:0.01886792480945587 - response_length_non_aborted/mean:1199.85498046875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:60.0 - response_length_non_aborted/clip_ratio:0.01886792480945587 - response/aborted_ratio:0.0 - prompt_length/mean:233.632080078125 - prompt_length/max:369.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.448120206594467e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.6552031664177775) - timing_s/agent_loop/generate_sequences/max:np.float64(28.90720174741) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.6808235041789885) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - 
timing_s/agent_loop/slowest/generate_sequences:np.float64(28.90720174741) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:185 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.191442635841668 - timing_s/reward:0.00012498628348112106 - timing_s/old_log_prob:14.796511294320226 - timing_s/ref:14.53470200765878 - timing_s/adv:0.09525289572775364 - timing_s/update_actor:32.44599993620068 - timing_s/save_checkpoint:51.07996885664761 - timing_s/update_weights:32.91498655453324 - timing_s/step:177.47636589873582 - timing_s/testing:145.44582308549434 - timing_s/stop_profile:0.00044676847755908966 - timing_per_token_ms/adv:7.835894274809302e-05 - timing_per_token_ms/update_actor:0.026691411657153382 - timing_per_token_ms/gen:0.030655673431283132 - timing_per_token_ms/ref:0.011956842611209785 - perf/total_num_tokens:1521468 - perf/time_per_step:177.47636589873582 - perf/throughput:2143.198042589114 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:789.0 - frontier/mean_score:2.698272642153573 - frontier/mean_frontier_pct:0.38174627245011566 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:80.0 - frontier/cluster_0/score:3.8366665699999993 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.072291989999999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.6601 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.775705699999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - 
frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:2.1598999999999995 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:2.024291 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:2.9861587127989995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:3.2536463929999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:2.3968036999999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:128.0 - frontier/cluster_10/score:4.265844845299998 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:3.4319299999999995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:3.4159646392999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.9423519899999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.136413292798999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.1798999999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.9596463929999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.6565509999999997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:3.4319299999999995 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.839577219299999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - 
frontier/cluster_21/score:2.622341875099999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.858804392999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:3.756595861099999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.8400999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:2.3629999999999995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.3825509999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.7815089999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.530215586999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:3.012941389999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.3262142999999993 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.4270562999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.8717625899999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.4039009729999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:1.7170036999999998 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3176456999999995 - 
frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.5984876999999997 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.53 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.1158124750999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:2.037448999999999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.4023500099999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:2.6878699999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:3.4244813125699993 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:3.379646393 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:2.5379299999999994 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:3.8363519899999994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.0878699999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.7598999999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - 
frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.0322909999999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:2.6974499899999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.62613 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.4659 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.1538463929999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:2.8823509999999994 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.5340999999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:250.0 - cluster/prob_snapshot/cluster_0:0.0233097917149281 - cluster/prob_snapshot/cluster_1:0.012590276944345948 - cluster/prob_snapshot/cluster_2:0.010085991190512065 - cluster/prob_snapshot/cluster_3:0.022939421979498935 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.013122542239857241 - cluster/prob_snapshot/cluster_6:0.012298645378611443 - cluster/prob_snapshot/cluster_7:0.018142503747220984 - cluster/prob_snapshot/cluster_8:0.0197676337912411 - cluster/prob_snapshot/cluster_9:0.014561858323948388 - cluster/prob_snapshot/cluster_10:0.02591727819395644 - cluster/prob_snapshot/cluster_11:0.020850801606200873 - cluster/prob_snapshot/cluster_12:0.020753803541401432 - cluster/prob_snapshot/cluster_13:0.0178763545873897 - 
cluster/prob_snapshot/cluster_14:0.01905538030297912 - cluster/prob_snapshot/cluster_15:0.019319585197704544 - cluster/prob_snapshot/cluster_16:0.01798142729162629 - cluster/prob_snapshot/cluster_17:0.010064429126338141 - cluster/prob_snapshot/cluster_18:0.020850801606200873 - cluster/prob_snapshot/cluster_19:0.01614268935196106 - cluster/prob_snapshot/cluster_20:0.01725194314718302 - cluster/prob_snapshot/cluster_21:0.015932122794271118 - cluster/prob_snapshot/cluster_22:0.007534267619633764 - cluster/prob_snapshot/cluster_23:0.017368758462258407 - cluster/prob_snapshot/cluster_24:0.022823319535792228 - cluster/prob_snapshot/cluster_25:0.011179587006602765 - cluster/prob_snapshot/cluster_26:0.014356482852346248 - cluster/prob_snapshot/cluster_27:0.01447526558457063 - cluster/prob_snapshot/cluster_28:0.016899147804547928 - cluster/prob_snapshot/cluster_29:0.009052543144306353 - cluster/prob_snapshot/cluster_30:0.015372406554170413 - cluster/prob_snapshot/cluster_31:0.018305222767947212 - cluster/prob_snapshot/cluster_32:0.01413299014339087 - cluster/prob_snapshot/cluster_33:0.01474565897275875 - cluster/prob_snapshot/cluster_34:0.017447486406832184 - cluster/prob_snapshot/cluster_35:0.02068051034699924 - cluster/prob_snapshot/cluster_36:0.01043171145851251 - cluster/prob_snapshot/cluster_37:0.014080931337225567 - cluster/prob_snapshot/cluster_38:0.01578719598268415 - cluster/prob_snapshot/cluster_39:0.017726250724194568 - cluster/prob_snapshot/cluster_40:0.021446629060000957 - cluster/prob_snapshot/cluster_41:0.018930219369403162 - cluster/prob_snapshot/cluster_42:0.01237858723276767 - cluster/prob_snapshot/cluster_43:0.019512457223809378 - cluster/prob_snapshot/cluster_44:0.01459555510956362 - cluster/prob_snapshot/cluster_45:0.01633024103442061 - cluster/prob_snapshot/cluster_46:0.020805546865040787 - cluster/prob_snapshot/cluster_47:0.020533150862504592 - cluster/prob_snapshot/cluster_48:0.015419275719617058 - cluster/prob_snapshot/cluster_49:0.02330788047397351 
- cluster/prob_snapshot/cluster_50:0.008159439894498948 - cluster/prob_snapshot/cluster_51:0.012684921647451613 - cluster/prob_snapshot/cluster_52:0.016767861626826242 - cluster/prob_snapshot/cluster_53:0.011604266715184653 - cluster/prob_snapshot/cluster_54:0.018422781948719363 - cluster/prob_snapshot/cluster_55:0.01638844457321056 - cluster/prob_snapshot/cluster_56:0.015955137669501504 - cluster/prob_snapshot/cluster_57:0.014981655127211434 - cluster/prob_snapshot/cluster_58:0.013085763354926921 - cluster/prob_snapshot/cluster_59:0.020642874976052356 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.017511816633915808 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.009320474119248573
(RewardLoopWorker pid=2826759) WARNING:2026-04-12 19:40:44,733:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  31%|███▏      | 251/800 [8:10:15<25:13:14, 165.38s/it]
[36m(TaskRunner pid=2823680)[0m step:251 - global_seqlen/min:356811 - global_seqlen/max:444494 - global_seqlen/minmax_diff:87683 - global_seqlen/balanced_min:398827 - global_seqlen/balanced_max:398988 - global_seqlen/mean:398910.75 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.2515498632565141) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01117763016372919 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.03931334818480536) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000583925949641802) - actor/ppo_kl:np.float64(-0.000640362668036687) - actor/pg_clipfrac_lower:np.float64(9.975715123721824e-05) - actor/grad_norm:np.float64(0.28477194036046666) - perf/mfu/actor:np.float64(0.22676002202980886) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.17851638793945) - actor/lr:np.float64(1e-06) - training/global_step:251 - training/epoch:0 - critic/score/mean:0.5714285969734192 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5713111758232117 - critic/rewards/max:1.108446717262268 - critic/rewards/min:-0.048471637070178986 - critic/advantages/mean:-0.15637890994548798 - critic/advantages/max:2.4747235774993896 - critic/advantages/min:-2.474834680557251 - critic/returns/mean:-0.15637890994548798 - critic/returns/max:2.4747235774993896 - critic/returns/min:-2.474834680557251 - response_length/mean:1243.88330078125 - response_length/max:8192.0 - response_length/min:222.0 - response_length/clip_ratio:0.02747252769768238 - response_length_non_aborted/mean:1243.88330078125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:222.0 - response_length_non_aborted/clip_ratio:0.02747252769768238 - response/aborted_ratio:0.0 - prompt_length/mean:239.7142791748047 - prompt_length/max:360.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.676612585783005e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7079404685646296) - timing_s/agent_loop/generate_sequences/max:np.float64(30.13260002154857) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.021832028807694) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.13260002154857) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:207 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:33.364288030192256 - timing_s/reward:0.00027325190603733063 - timing_s/old_log_prob:17.94617484882474 - timing_s/ref:27.063651824370027 - timing_s/adv:0.09104307834059 - timing_s/update_actor:20.772157735191286 - timing_s/update_weights:37.82471461314708 - timing_s/step:137.47732020448893 - timing_s/stop_profile:6.612390279769897e-05 - timing_per_token_ms/adv:8.429454163206825e-05 - timing_per_token_ms/update_actor:0.019232428723978307 - timing_per_token_ms/gen:0.036844347151713 - timing_per_token_ms/ref:0.025057567988758047 - perf/total_num_tokens:1595643 - perf/time_per_step:137.47732020448893 - perf/throughput:2901.647700192622 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:826.0 - frontier/mean_score:2.703684100195704 - frontier/mean_frontier_pct:0.3967690063333507 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:80.0 - frontier/cluster_0/score:3.585666598999999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.3506043929999993 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.6601 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.775705699999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.8119299999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:2.024291 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:2.9861587127989995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:3.2536463929999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:2.3968036999999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:128.0 - frontier/cluster_10/score:4.265844845299998 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:3.4319299999999995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:3.4159646392999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:3.5596463929999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:3.136413292798999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.1798999999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - frontier/cluster_16/score:2.9717524750999993 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.6565509999999997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:3.4319299999999995 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.839577219299999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.735639312569999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.858804392999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:3.756595861099999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.8400999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:2.3629999999999995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.3825509999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.7815089999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.9429999999999998 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.530215586999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.409058972999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:1.9283500099999995 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.4270562999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:2.8717625899999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.4039009729999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:1.5019025899999998 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3176456999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.5984876999999997 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.53 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.1158124750999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:2.037448999999999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.4023500099999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:2.6878699999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:3.4244813125699993 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:3.379646393 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:2.676550999999999 - frontier/cluster_48/cluster_size:302.0 
- frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:3.5854463929999993 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.0878699999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.8319299999999994 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.0322909999999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:2.6974499899999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.7382909999999994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.62613 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.1538463929999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:2.8823509999999994 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.9738699999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:251.0 - cluster/prob_snapshot/cluster_0:0.02174123066059487 - cluster/prob_snapshot/cluster_1:0.014252588992594342 - cluster/prob_snapshot/cluster_2:0.01006580395113125 - 
cluster/prob_snapshot/cluster_3:0.022893508435256175 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.010986405730482044 - cluster/prob_snapshot/cluster_6:0.012274029483789788 - cluster/prob_snapshot/cluster_7:0.018106191295703376 - cluster/prob_snapshot/cluster_8:0.019728068621313977 - cluster/prob_snapshot/cluster_9:0.014532712579691581 - cluster/prob_snapshot/cluster_10:0.025865404432705016 - cluster/prob_snapshot/cluster_11:0.020809068462144368 - cluster/prob_snapshot/cluster_12:0.02071226454020274 - cluster/prob_snapshot/cluster_13:0.021583460470627973 - cluster/prob_snapshot/cluster_14:0.01901724071744879 - cluster/prob_snapshot/cluster_15:0.019280916802724087 - cluster/prob_snapshot/cluster_16:0.018018840916598784 - cluster/prob_snapshot/cluster_17:0.01004428504370244 - cluster/prob_snapshot/cluster_18:0.020809068462144368 - cluster/prob_snapshot/cluster_19:0.016110379554337525 - cluster/prob_snapshot/cluster_20:0.017217413164009526 - cluster/prob_snapshot/cluster_21:0.016587198964723254 - cluster/prob_snapshot/cluster_22:0.007519187687366943 - cluster/prob_snapshot/cluster_23:0.01733399467174915 - cluster/prob_snapshot/cluster_24:0.022777638372064136 - cluster/prob_snapshot/cluster_25:0.011157210921315951 - cluster/prob_snapshot/cluster_26:0.014327748169702512 - cluster/prob_snapshot/cluster_27:0.014446293156780738 - cluster/prob_snapshot/cluster_28:0.016865323945730454 - cluster/prob_snapshot/cluster_29:0.011781131905938207 - cluster/prob_snapshot/cluster_30:0.01534163848734321 - cluster/prob_snapshot/cluster_31:0.01460702085954556 - cluster/prob_snapshot/cluster_32:0.011692303565943004 - cluster/prob_snapshot/cluster_33:0.014716145349170524 - cluster/prob_snapshot/cluster_34:0.017412565041342632 - cluster/prob_snapshot/cluster_35:0.020639118043059394 - cluster/prob_snapshot/cluster_36:0.009106594195913652 - cluster/prob_snapshot/cluster_37:0.014052748174436688 - cluster/prob_snapshot/cluster_38:0.015755597709551202 - 
cluster/prob_snapshot/cluster_39:0.017690771408385698 - cluster/prob_snapshot/cluster_40:0.021403703359733336 - cluster/prob_snapshot/cluster_41:0.018892330295069943 - cluster/prob_snapshot/cluster_42:0.012353811333310287 - cluster/prob_snapshot/cluster_43:0.01947340279302071 - cluster/prob_snapshot/cluster_44:0.01456634192076272 - cluster/prob_snapshot/cluster_45:0.0162975558497242 - cluster/prob_snapshot/cluster_46:0.020763904298923094 - cluster/prob_snapshot/cluster_47:0.02049205350044327 - cluster/prob_snapshot/cluster_48:0.01622892454141575 - cluster/prob_snapshot/cluster_49:0.021739895469687777 - cluster/prob_snapshot/cluster_50:0.008143108671989198 - cluster/prob_snapshot/cluster_51:0.012659532615775194 - cluster/prob_snapshot/cluster_52:0.017171045228195358 - cluster/prob_snapshot/cluster_53:0.011581040628071011 - cluster/prob_snapshot/cluster_54:0.01838590851682412 - cluster/prob_snapshot/cluster_55:0.016355642893392534 - cluster/prob_snapshot/cluster_56:0.016603277132189107 - cluster/prob_snapshot/cluster_57:0.01592320325895085 - cluster/prob_snapshot/cluster_58:0.013059572033485444 - cluster/prob_snapshot/cluster_59:0.020601557999785133 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.017476766510660262 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.011968308201324883
(RewardLoopWorker pid=2826754) WARNING:2026-04-12 19:42:40,953:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  32%|███▏      | 252/800 [8:12:33<23:54:09, 157.02s/it]
[36m(TaskRunner pid=2823680)[0m step:252 - global_seqlen/min:376018 - global_seqlen/max:439392 - global_seqlen/minmax_diff:63374 - global_seqlen/balanced_min:402076 - global_seqlen/balanced_max:402578 - global_seqlen/mean:402416.75 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.22407308194254125) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012907139025628567 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.03700231510447338) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006156351054396139) - actor/ppo_kl:np.float64(3.956322104516273e-05) - actor/pg_clipfrac_lower:np.float64(1.052219153168654e-05) - actor/grad_norm:np.float64(0.3612162069632457) - perf/mfu/actor:np.float64(0.2213278963759041) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(108.03802871704102) - actor/lr:np.float64(1e-06) - training/global_step:252 - training/epoch:0 - critic/score/mean:0.5824742317199707 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5790103673934937 - critic/rewards/max:1.0557132959365845 - critic/rewards/min:-0.18664458394050598 - critic/advantages/mean:-0.09296590089797974 - critic/advantages/max:2.4748153686523438 - critic/advantages/min:-2.4748423099517822 - critic/returns/mean:-0.09296590089797974 - critic/returns/max:2.4748153686523438 - critic/returns/min:-2.4748423099517822 - response_length/mean:1263.644287109375 - response_length/max:8192.0 - response_length/min:145.0 - response_length/clip_ratio:0.027061855420470238 - response_length_non_aborted/mean:1263.644287109375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:145.0 - response_length_non_aborted/clip_ratio:0.027061855420470238 - response/aborted_ratio:0.0 - prompt_length/mean:235.98968505859375 - prompt_length/max:410.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.828099817037582e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1805364191532135) - timing_s/agent_loop/generate_sequences/max:np.float64(30.10569576267153) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.086255308045111) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.10569576267153) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:216 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.53135699313134 - timing_s/reward:0.0001656739041209221 - timing_s/old_log_prob:14.94653185363859 - timing_s/ref:27.06704958807677 - timing_s/adv:0.08613271079957485 - timing_s/update_actor:21.579796346835792 - timing_s/update_weights:40.59834622498602 - timing_s/step:137.28908993955702 - timing_s/stop_profile:5.020759999752045e-05 - timing_per_token_ms/adv:7.401523292588128e-05 - timing_per_token_ms/update_actor:0.018543868389569097 - timing_per_token_ms/gen:0.03317535702367492 - timing_per_token_ms/ref:0.02325915394140561 - perf/total_num_tokens:1609667 - perf/time_per_step:137.28908993955702 - perf/throughput:2931.1633588449617 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:857.0 - frontier/mean_score:2.693351951318157 - frontier/mean_frontier_pct:0.4103395542550157 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:80.0 - frontier/cluster_0/score:3.585666598999999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.3506043929999993 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.6601 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.5429939899999994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.8119299999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:2.024291 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:2.3903110989592995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:3.1775524750999993 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:2.3968036999999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:144.0 - frontier/cluster_10/score:4.486091391709999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:3.4319299999999995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:3.291175247509999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:3.391752475099999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.095489304959299 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.1798999999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - 
frontier/cluster_16/score:2.9717524750999993 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.6565509999999997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:3.4319299999999995 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.839577219299999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.735639312569999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.901163075099999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:3.756595861099999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.8400999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:2.3629999999999995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.5677856999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.7815089999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.9429999999999998 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.530215586999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.409058972999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:96.0 - 
frontier/cluster_32/score:1.6498450069999997 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.4270562999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.5102338129999997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.4039009729999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:1.5019025899999998 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3176456999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.5984876999999997 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.53 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.1158124750999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:2.037448999999999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.4023500099999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.1815089999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:3.297136918798999 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:3.379646393 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - 
frontier/cluster_48/score:2.676550999999999 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:3.5854463929999993 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.0878699999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.8319299999999994 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.0322909999999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:2.7882149929999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.7382909999999994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.62613 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.1538463929999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:2.9176456999999996 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:2.2817089999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:252.0 - cluster/prob_snapshot/cluster_0:0.021824633660286915 - 
cluster/prob_snapshot/cluster_1:0.014307264309457368 - cluster/prob_snapshot/cluster_2:0.010104418059823725 - cluster/prob_snapshot/cluster_3:0.021564901185713456 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.011028551421683271 - cluster/prob_snapshot/cluster_6:0.012321114715221147 - cluster/prob_snapshot/cluster_7:0.014548944423180193 - cluster/prob_snapshot/cluster_8:0.019340593106100843 - cluster/prob_snapshot/cluster_9:0.01458846249751962 - cluster/prob_snapshot/cluster_10:0.02730518816722743 - cluster/prob_snapshot/cluster_11:0.0208888955316251 - cluster/prob_snapshot/cluster_12:0.020032173127513313 - cluster/prob_snapshot/cluster_13:0.02064434971619315 - cluster/prob_snapshot/cluster_14:0.018841104775026762 - cluster/prob_snapshot/cluster_15:0.01935488162666915 - cluster/prob_snapshot/cluster_16:0.018087964206208233 - cluster/prob_snapshot/cluster_17:0.01008281660226435 - cluster/prob_snapshot/cluster_18:0.0208888955316251 - cluster/prob_snapshot/cluster_19:0.016172181666737926 - cluster/prob_snapshot/cluster_20:0.0172834620426233 - cluster/prob_snapshot/cluster_21:0.016650830236188217 - cluster/prob_snapshot/cluster_22:0.007548032549838805 - cluster/prob_snapshot/cluster_23:0.017658312493545065 - cluster/prob_snapshot/cluster_24:0.022865017205203234 - cluster/prob_snapshot/cluster_25:0.011200011849817261 - cluster/prob_snapshot/cluster_26:0.014382711809748483 - cluster/prob_snapshot/cluster_27:0.01562916703863448 - cluster/prob_snapshot/cluster_28:0.016930022151172956 - cluster/prob_snapshot/cluster_29:0.011826326299763566 - cluster/prob_snapshot/cluster_30:0.015400491580344724 - cluster/prob_snapshot/cluster_31:0.01466305583637226 - cluster/prob_snapshot/cluster_32:0.01004199968956135 - cluster/prob_snapshot/cluster_33:0.014772598945803832 - cluster/prob_snapshot/cluster_34:0.021365502038600742 - cluster/prob_snapshot/cluster_35:0.02071829315428754 - cluster/prob_snapshot/cluster_36:0.009141528615440049 - 
cluster/prob_snapshot/cluster_37:0.014106656868473462 - cluster/prob_snapshot/cluster_38:0.015816038819414378 - cluster/prob_snapshot/cluster_39:0.017758636168451917 - cluster/prob_snapshot/cluster_40:0.02148581154820658 - cluster/prob_snapshot/cluster_41:0.01896480443613306 - cluster/prob_snapshot/cluster_42:0.01240120262126967 - cluster/prob_snapshot/cluster_43:0.01954810602544136 - cluster/prob_snapshot/cluster_44:0.014622220846371723 - cluster/prob_snapshot/cluster_45:0.013278042851194497 - cluster/prob_snapshot/cluster_46:0.02006845962774782 - cluster/prob_snapshot/cluster_47:0.02057066444747142 - cluster/prob_snapshot/cluster_48:0.016291181412227723 - cluster/prob_snapshot/cluster_49:0.02182329334736403 - cluster/prob_snapshot/cluster_50:0.00817434699978511 - cluster/prob_snapshot/cluster_51:0.012708096701743363 - cluster/prob_snapshot/cluster_52:0.017236916231646644 - cluster/prob_snapshot/cluster_53:0.011625467438264751 - cluster/prob_snapshot/cluster_54:0.01845643993918495 - cluster/prob_snapshot/cluster_55:0.016970839063875958 - cluster/prob_snapshot/cluster_56:0.016666970082195507 - cluster/prob_snapshot/cluster_57:0.015984287331754037 - cluster/prob_snapshot/cluster_58:0.013109670737615433 - cluster/prob_snapshot/cluster_59:0.02068058902440077 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.017758636168451917 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.013887923394290904
[36m(TaskRunner pid=2823680)[0m Training Progress:  32%|███▏      | 253/800 [8:14:42<22:36:41, 148.81s/it]
[36m(TaskRunner pid=2823680)[0m step:253 - global_seqlen/min:307702 - global_seqlen/max:444854 - global_seqlen/minmax_diff:137152 - global_seqlen/balanced_min:393388 - global_seqlen/balanced_max:393626 - global_seqlen/mean:393493.75 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.21720320586529043) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012563218362629414 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.06213021537405439) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004826014157313491) - actor/ppo_kl:np.float64(0.000129194632588931) - actor/pg_clipfrac_lower:np.float64(7.214540447522369e-06) - actor/grad_norm:np.float64(0.2434947950144609) - perf/mfu/actor:np.float64(0.23027272572401103) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(109.61682510375977) - actor/lr:np.float64(1e-06) - training/global_step:253 - training/epoch:0 - critic/score/mean:0.6541666388511658 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6474412679672241 - critic/rewards/max:1.0772522687911987 - critic/rewards/min:-0.07159368693828583 - critic/advantages/mean:-0.1264798790216446 - critic/advantages/max:2.4748213291168213 - critic/advantages/min:-2.474853277206421 - critic/returns/mean:-0.1264798790216446 - critic/returns/max:2.4748213291168213 - critic/returns/min:-2.474853277206421 - response_length/mean:1166.15966796875 - response_length/max:8192.0 - response_length/min:161.0 - response_length/clip_ratio:0.01805555634200573 - response_length_non_aborted/mean:1166.15966796875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:161.0 - response_length_non_aborted/clip_ratio:0.01805555634200573 - response/aborted_ratio:0.0 - prompt_length/mean:232.8222198486328 - prompt_length/max:340.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.522998541593552e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.685516720637679) - timing_s/agent_loop/generate_sequences/max:np.float64(30.56382151134312) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.451140319692058) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.56382151134312) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:196 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.816046993248165 - timing_s/reward:0.00017862115055322647 - timing_s/old_log_prob:11.106354751624167 - timing_s/ref:27.85491813160479 - timing_s/adv:0.07042528595775366 - timing_s/update_actor:20.151752825826406 - timing_s/update_weights:36.960967645049095 - timing_s/step:129.4533686749637 - timing_s/stop_profile:4.773586988449097e-05 - timing_per_token_ms/adv:6.991719768219714e-05 - timing_per_token_ms/update_actor:0.020006366560034634 - timing_per_token_ms/gen:0.039083705411575465 - timing_per_token_ms/ref:0.027653956827340506 - perf/total_num_tokens:1573975 - perf/time_per_step:129.4533686749637 - perf/throughput:3039.656318160392 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:895.0 - frontier/mean_score:2.680164604134343 - frontier/mean_frontier_pct:0.42834626826116345 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:80.0 - frontier/cluster_0/score:3.585666598999999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.3506043929999993 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.6601 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.5429939899999994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.8119299999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:2.3170037 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:2.3903110989592995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:3.1775524750999993 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:2.3968036999999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:160.0 - frontier/cluster_10/score:4.6402639741969995 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:3.4319299999999995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:3.291175247509999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:3.391752475099999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.095489304959299 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.1259299999999994 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - frontier/cluster_16/score:2.9717524750999993 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.6565509999999997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:3.3023509999999994 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:2.7598999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.839577219299999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.735639312569999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:1.2401 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.930814152569999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:3.756595861099999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.8400999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:2.3629999999999995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.0974499899999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.7815089999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.9429999999999998 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.530215586999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.409058972999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.4548915048999997 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.4270562999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.5102338129999997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.4039009729999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:1.5019025899999998 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3176456999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.5984876999999997 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.9423519899999997 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.3709999999999996 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.0810687325699995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:2.037448999999999 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.4023500099999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.1815089999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:3.207995843159299 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:3.379646393 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.173585699999999 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:3.5854463929999993 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.0878699999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:2.8823509999999994 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.0322909999999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.4517504950999998 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.7382909999999994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.1382909999999997 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.1538463929999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:2.9176456999999996 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:2.2817089999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:253.0 - cluster/prob_snapshot/cluster_0:0.02193201849060435 - cluster/prob_snapshot/cluster_1:0.014377661053526135 - 
cluster/prob_snapshot/cluster_2:0.010154135330486784 - cluster/prob_snapshot/cluster_3:0.021671008041419994 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.011082815751682979 - cluster/prob_snapshot/cluster_6:0.014172139709076924 - cluster/prob_snapshot/cluster_7:0.014620530317931037 - cluster/prob_snapshot/cluster_8:0.019435755588150493 - cluster/prob_snapshot/cluster_9:0.014660242835016832 - cluster/prob_snapshot/cluster_10:0.028382548258043955 - cluster/prob_snapshot/cluster_11:0.020991676203094695 - cluster/prob_snapshot/cluster_12:0.020130738425133948 - cluster/prob_snapshot/cluster_13:0.02074592713672604 - cluster/prob_snapshot/cluster_14:0.018933809599801937 - cluster/prob_snapshot/cluster_15:0.019120002562272483 - cluster/prob_snapshot/cluster_16:0.01817696331598967 - cluster/prob_snapshot/cluster_17:0.01013242758620156 - cluster/prob_snapshot/cluster_18:0.020199095815172794 - cluster/prob_snapshot/cluster_19:0.016881150592500737 - cluster/prob_snapshot/cluster_20:0.017368502720402107 - cluster/prob_snapshot/cluster_21:0.01673275814422963 - cluster/prob_snapshot/cluster_22:0.007585171509750413 - cluster/prob_snapshot/cluster_23:0.017926560769653464 - cluster/prob_snapshot/cluster_24:0.022977521086414026 - cluster/prob_snapshot/cluster_25:0.011255119825088086 - cluster/prob_snapshot/cluster_26:0.014453479781904865 - cluster/prob_snapshot/cluster_27:0.01282922176217586 - cluster/prob_snapshot/cluster_28:0.01701332378107762 - cluster/prob_snapshot/cluster_29:0.011884515961168495 - cluster/prob_snapshot/cluster_30:0.015476267384919612 - cluster/prob_snapshot/cluster_31:0.014735203199184085 - cluster/prob_snapshot/cluster_32:0.008898961045678076 - cluster/prob_snapshot/cluster_33:0.014845285299024472 - cluster/prob_snapshot/cluster_34:0.021470627780767807 - cluster/prob_snapshot/cluster_35:0.020820234402396023 - cluster/prob_snapshot/cluster_36:0.009186508133286312 - cluster/prob_snapshot/cluster_37:0.014176066553774333 - 
cluster/prob_snapshot/cluster_38:0.01589385926173444 - cluster/prob_snapshot/cluster_39:0.017997132881384912 - cluster/prob_snapshot/cluster_40:0.020618992951672157 - cluster/prob_snapshot/cluster_41:0.01884560500755808 - cluster/prob_snapshot/cluster_42:0.012462220875227371 - cluster/prob_snapshot/cluster_43:0.01964428954362746 - cluster/prob_snapshot/cluster_44:0.014694167286751564 - cluster/prob_snapshot/cluster_45:0.013343375465739948 - cluster/prob_snapshot/cluster_46:0.019621964900354547 - cluster/prob_snapshot/cluster_47:0.020671879310712318 - cluster/prob_snapshot/cluster_48:0.013294911963261755 - cluster/prob_snapshot/cluster_49:0.021930671582873137 - cluster/prob_snapshot/cluster_50:0.008214567645830824 - cluster/prob_snapshot/cluster_51:0.012770624981906776 - cluster/prob_snapshot/cluster_52:0.017630131994436422 - cluster/prob_snapshot/cluster_53:0.011682668803824925 - cluster/prob_snapshot/cluster_54:0.018547252078439307 - cluster/prob_snapshot/cluster_55:0.02111290985743037 - cluster/prob_snapshot/cluster_56:0.016748977403923847 - cluster/prob_snapshot/cluster_57:0.013079029088586177 - cluster/prob_snapshot/cluster_58:0.01317417490247745 - cluster/prob_snapshot/cluster_59:0.020782344755003977 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.017846014869112005 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.013956256834401342
[36m(RewardLoopWorker pid=2826758)[0m WARNING:2026-04-12 19:47:10,884:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  32%|███▏      | 254/800 [8:17:00<22:04:03, 145.50s/it]
[36m(TaskRunner pid=2823680)[0m step:254 - global_seqlen/min:338993 - global_seqlen/max:490490 - global_seqlen/minmax_diff:151497 - global_seqlen/balanced_min:416020 - global_seqlen/balanced_max:416101 - global_seqlen/mean:416067.5 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.19285454787313938) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01272724848240614 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.011946445723879151) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00044557426144820056) - actor/ppo_kl:np.float64(2.050502298326743e-06) - actor/pg_clipfrac_lower:np.float64(1.8001425075908628e-05) - actor/grad_norm:np.float64(0.24099678422013918) - perf/mfu/actor:np.float64(0.21990735364568947) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(110.19261169433594) - actor/lr:np.float64(1e-06) - training/global_step:254 - training/epoch:0 - critic/score/mean:0.5768229365348816 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5707986950874329 - critic/rewards/max:1.1838464736938477 - critic/rewards/min:-0.10445583611726761 - critic/advantages/mean:-0.07867419719696045 - critic/advantages/max:2.4748282432556152 - critic/advantages/min:-2.474839925765991 - critic/returns/mean:-0.07867419719696045 - critic/returns/max:2.4748282432556152 - critic/returns/min:-2.474839925765991 - response_length/mean:1251.1796875 - response_length/max:8192.0 - response_length/min:138.0 - response_length/clip_ratio:0.014322916977107525 - response_length_non_aborted/mean:1251.1796875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:138.0 - response_length_non_aborted/clip_ratio:0.014322916977107525 - response/aborted_ratio:0.0 - prompt_length/mean:228.8020782470703 - prompt_length/max:555.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.801836520433426e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1983553851023316) - timing_s/agent_loop/generate_sequences/max:np.float64(31.361004589125514) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.429847130529197) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(31.361004589125514) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:193 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.87063074298203 - timing_s/reward:0.00011639203876256943 - timing_s/old_log_prob:14.68454883620143 - timing_s/ref:27.91424883995205 - timing_s/adv:0.09605790488421917 - timing_s/update_actor:22.44653821736574 - timing_s/update_weights:39.144271505996585 - timing_s/step:137.55246652662754 - timing_s/stop_profile:4.855543375015259e-05 - timing_per_token_ms/adv:8.451144429585384e-05 - timing_per_token_ms/update_actor:0.019748394122046955 - timing_per_token_ms/gen:0.03420795659823336 - timing_per_token_ms/ref:0.024558868827522905 - perf/total_num_tokens:1664270 - perf/time_per_step:137.55246652662754 - perf/throughput:3024.791270605513 - frontier/active_count:60.0 - frontier/completed_count:4.0 - frontier/blacklisted_count:927.0 - frontier/mean_score:2.690506599729283 - frontier/mean_frontier_pct:0.44165680976973476 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.409966619299999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.3506043929999993 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.6601 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.5429939899999994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.8119299999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:2.3170037 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:2.3903110989592995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:3.1775524750999993 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:2.3968036999999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:176.0 - frontier/cluster_10/score:4.7481847819378995 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:3.4319299999999995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:3.291175247509999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:3.391752475099999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.066842513471509 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.0881509999999994 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - frontier/cluster_16/score:2.9802267325699994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.4595856999999997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.211645699999999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:2.7598999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:96.0 - frontier/cluster_20/score:2.887704053509999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.735639312569999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:96.0 - frontier/cluster_23/score:2.951569906798999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:3.756595861099999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.8400999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:2.3629999999999995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.0974499899999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.7815089999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.9429999999999998 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.530215586999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.409058972999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.4548915048999997 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.4270562999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.3571636690999997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.4039009729999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:1.5019025899999998 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3176456999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.5984876999999997 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.9596463929999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.3709999999999996 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.0810687325699995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:1.7262142999999994 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.4023500099999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.4270562999999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:3.207995843159299 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:3.379646393 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:80.0 - frontier/cluster_48/score:1.8215099899999991 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:3.5854463929999993 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.0878699999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:2.8823509999999994 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.0322909999999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.4517504950999998 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.8168036999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.1382909999999997 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.1538463929999994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:2.9176456999999996 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:2.2817089999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:254.0 - cluster/prob_snapshot/cluster_0:0.021123448273292395 - cluster/prob_snapshot/cluster_1:0.014561101573686999 - 
cluster/prob_snapshot/cluster_2:0.010283689077780853 - cluster/prob_snapshot/cluster_3:0.02194750231769544 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01122421827040748 - cluster/prob_snapshot/cluster_6:0.01435295804039987 - cluster/prob_snapshot/cluster_7:0.014807069538501354 - cluster/prob_snapshot/cluster_8:0.019683730909138837 - cluster/prob_snapshot/cluster_9:0.014847288736386202 - cluster/prob_snapshot/cluster_10:0.029413201603096226 - cluster/prob_snapshot/cluster_11:0.021259503076145075 - cluster/prob_snapshot/cluster_12:0.020387580836022694 - cluster/prob_snapshot/cluster_13:0.021010618566786363 - cluster/prob_snapshot/cluster_14:0.01899792473890786 - cluster/prob_snapshot/cluster_15:0.0191299227210638 - cluster/prob_snapshot/cluster_16:0.01846137286852637 - cluster/prob_snapshot/cluster_17:0.00904157913449498 - cluster/prob_snapshot/cluster_18:0.0198949254905077 - cluster/prob_snapshot/cluster_19:0.017096532429231596 - cluster/prob_snapshot/cluster_20:0.017888230007194913 - cluster/prob_snapshot/cluster_21:0.01694624668358775 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01828385471529156 - cluster/prob_snapshot/cluster_24:0.023270684793946644 - cluster/prob_snapshot/cluster_25:0.011398720722862807 - cluster/prob_snapshot/cluster_26:0.014637887651825885 - cluster/prob_snapshot/cluster_27:0.012992906182371276 - cluster/prob_snapshot/cluster_28:0.017230391978223684 - cluster/prob_snapshot/cluster_29:0.012036147146634658 - cluster/prob_snapshot/cluster_30:0.015673724713247853 - cluster/prob_snapshot/cluster_31:0.014923205498686858 - cluster/prob_snapshot/cluster_32:0.009012500378468933 - cluster/prob_snapshot/cluster_33:0.015034692104975125 - cluster/prob_snapshot/cluster_34:0.020796354048699697 - cluster/prob_snapshot/cluster_35:0.021085873897890316 - cluster/prob_snapshot/cluster_36:0.009303716198225271 - cluster/prob_snapshot/cluster_37:0.01435693498660066 - 
cluster/prob_snapshot/cluster_38:0.01609664452697903 - cluster/prob_snapshot/cluster_39:0.0183338854802648 - cluster/prob_snapshot/cluster_40:0.020882064864284833 - cluster/prob_snapshot/cluster_41:0.019086050763822783 - cluster/prob_snapshot/cluster_42:0.010693242059405528 - cluster/prob_snapshot/cluster_43:0.0198949254905077 - cluster/prob_snapshot/cluster_44:0.014881646020627503 - cluster/prob_snapshot/cluster_45:0.015034692104975125 - cluster/prob_snapshot/cluster_46:0.01987231601341104 - cluster/prob_snapshot/cluster_47:0.020935625985456032 - cluster/prob_snapshot/cluster_48:0.011283562670460637 - cluster/prob_snapshot/cluster_49:0.02221047883299973 - cluster/prob_snapshot/cluster_50:0.008319374996361475 - cluster/prob_snapshot/cluster_51:0.012933561782318118 - cluster/prob_snapshot/cluster_52:0.017855069873520098 - cluster/prob_snapshot/cluster_53:0.011831724678369635 - cluster/prob_snapshot/cluster_54:0.018783891233873366 - cluster/prob_snapshot/cluster_55:0.021382283516465588 - cluster/prob_snapshot/cluster_56:0.01744902924157743 - cluster/prob_snapshot/cluster_57:0.013245900729966325 - cluster/prob_snapshot/cluster_58:0.013342260482447913 - cluster/prob_snapshot/cluster_59:0.02104750082717925 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.018073707136873843 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.014134320777046124
[36m(TaskRunner pid=2823680)[0m Training Progress:  32%|███▏      | 255/800 [8:19:05<21:05:45, 139.35s/it]
[36m(TaskRunner pid=2823680)[0m step:255 - global_seqlen/min:339490 - global_seqlen/max:505020 - global_seqlen/minmax_diff:165530 - global_seqlen/balanced_min:417642 - global_seqlen/balanced_max:417801 - global_seqlen/mean:417722.5 - frontier/skipped_zero_acc_count:22.0 - actor/entropy:np.float64(0.216786033491481) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013735390268266201 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05879096319767996) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005775674171844621) - actor/ppo_kl:np.float64(-5.2899711125521536e-05) - actor/pg_clipfrac_lower:np.float64(9.322805007681347e-06) - actor/grad_norm:np.float64(0.27842586061784197) - perf/mfu/actor:np.float64(0.16600127636964848) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(109.93045806884766) - actor/lr:np.float64(1e-06) - training/global_step:255 - training/epoch:0 - critic/score/mean:0.6438679099082947 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6375735402107239 - critic/rewards/max:1.149521827697754 - critic/rewards/min:-0.18904484808444977 - critic/advantages/mean:-0.12269429862499237 - critic/advantages/max:2.4745006561279297 - critic/advantages/min:-2.474830150604248 - critic/returns/mean:-0.12269429862499237 - critic/returns/max:2.4745006561279297 - critic/returns/min:-2.474830150604248 - response_length/mean:1254.556640625 - response_length/max:8192.0 - response_length/min:132.0 - response_length/clip_ratio:0.021226415410637856 - response_length_non_aborted/mean:1254.556640625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:132.0 - response_length_non_aborted/clip_ratio:0.021226415410637856 - response/aborted_ratio:0.0 - prompt_length/mean:243.71697998046875 - prompt_length/max:546.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00013407692313194275 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2171202935278416) - timing_s/agent_loop/generate_sequences/max:np.float64(30.858192120678723) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.632755997006825) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.858192120678723) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:209 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.76653527375311 - timing_s/reward:0.00015693530440330505 - timing_s/old_log_prob:13.77746482938528 - timing_s/ref:15.629693117924035 - timing_s/adv:0.10249199252575636 - timing_s/update_actor:30.10004125442356 - timing_s/update_weights:31.875779391266406 - timing_s/step:124.70268031861633 - timing_s/stop_profile:5.848146975040436e-05 - timing_per_token_ms/adv:8.066831048136877e-05 - timing_per_token_ms/update_actor:0.02369082123955839 - timing_per_token_ms/gen:0.030799552643714898 - timing_per_token_ms/ref:0.01230165309595638 - perf/total_num_tokens:1670890 - perf/time_per_step:124.70268031861633 - perf/throughput:3349.747567034772 - frontier/active_count:60.0 - frontier/completed_count:4.0 - frontier/blacklisted_count:949.0 - frontier/mean_score:2.7477767909702973 - frontier/mean_frontier_pct:0.45360239036838224 - frontier/batch_easy_count:4.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.409966619299999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.3506043929999993 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.6601 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.5429939899999994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:1.8119299999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:2.3170037 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:2.5732177692715092 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:3.1775524750999993 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:2.3968036999999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:192.0 - frontier/cluster_10/score:4.8237293473565295 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:3.3023509999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:3.291175247509999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:3.391752475099999 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.066842513471509 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.0881509999999994 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - frontier/cluster_16/score:2.9802267325699994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.4595856999999997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.211645699999999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.4319299999999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:96.0 - frontier/cluster_20/score:2.887704053509999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.735639312569999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:96.0 - frontier/cluster_23/score:2.951569906798999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:3.756595861099999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.8400999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:2.3629999999999995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.0974499899999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.8470562999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.9429999999999998 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.530215586999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.409058972999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.9184240534299997 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.4270562999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.3571636690999997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.4039009729999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:1.9513318129999997 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3176456999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.5984876999999997 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.9596463929999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:48.0 - frontier/cluster_40/score:3.8596999999999997 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.0810687325699995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:1.5083500099999996 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.4023500099999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.4270562999999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:3.207995843159299 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:3.379646393 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:80.0 - frontier/cluster_48/score:2.175056992999999 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:4.0098124750999995 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.361509 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:2.9176456999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.0226036999999994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.4517504950999998 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.8717625899999994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.1382909999999997 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.4076924750999993 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:2.9176456999999996 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:2.2817089999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:255.0 - cluster/prob_snapshot/cluster_0:0.020683185466554756 - cluster/prob_snapshot/cluster_1:0.014257613650452009 - 
cluster/prob_snapshot/cluster_2:0.010069352585063166 - cluster/prob_snapshot/cluster_3:0.021490064268459584 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.010990278916603518 - cluster/prob_snapshot/cluster_6:0.014053808322508235 - cluster/prob_snapshot/cluster_7:0.015607877234470541 - cluster/prob_snapshot/cluster_8:0.01927347522999942 - cluster/prob_snapshot/cluster_9:0.014537835993303992 - cluster/prob_snapshot/cluster_10:0.02925837694924976 - cluster/prob_snapshot/cluster_11:0.020030441888221145 - cluster/prob_snapshot/cluster_12:0.0199626552535454 - cluster/prob_snapshot/cluster_13:0.020572707398977513 - cluster/prob_snapshot/cluster_14:0.018601962888383807 - cluster/prob_snapshot/cluster_15:0.018731209719243056 - cluster/prob_snapshot/cluster_16:0.018076594032695667 - cluster/prob_snapshot/cluster_17:0.008853131161626546 - cluster/prob_snapshot/cluster_18:0.01948026801494006 - cluster/prob_snapshot/cluster_19:0.020816404564336984 - cluster/prob_snapshot/cluster_20:0.017515396829171914 - cluster/prob_snapshot/cluster_21:0.016593046674192605 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01790277577918267 - cluster/prob_snapshot/cluster_24:0.022785668480817343 - cluster/prob_snapshot/cluster_25:0.011161144323700217 - cluster/prob_snapshot/cluster_26:0.014332799324440851 - cluster/prob_snapshot/cluster_27:0.012722103173813148 - cluster/prob_snapshot/cluster_28:0.017268847487636508 - cluster/prob_snapshot/cluster_29:0.011785285267621065 - cluster/prob_snapshot/cluster_30:0.015347047082540543 - cluster/prob_snapshot/cluster_31:0.014612170470060333 - cluster/prob_snapshot/cluster_32:0.011636219626319333 - cluster/prob_snapshot/cluster_33:0.01472133343081672 - cluster/prob_snapshot/cluster_34:0.020362908662087963 - cluster/prob_snapshot/cluster_35:0.020646394230333456 - cluster/prob_snapshot/cluster_36:0.011835821959850336 - cluster/prob_snapshot/cluster_37:0.014057702379709372 - 
cluster/prob_snapshot/cluster_38:0.015761152243388855 - cluster/prob_snapshot/cluster_39:0.017951763784246382 - cluster/prob_snapshot/cluster_40:0.02341104763120794 - cluster/prob_snapshot/cluster_41:0.0186882521577381 - cluster/prob_snapshot/cluster_42:0.009148911554950635 - cluster/prob_snapshot/cluster_43:0.01948026801494006 - cluster/prob_snapshot/cluster_44:0.014571477190181325 - cluster/prob_snapshot/cluster_45:0.01472133343081672 - cluster/prob_snapshot/cluster_46:0.01945812977301848 - cluster/prob_snapshot/cluster_47:0.020499277841066175 - cluster/prob_snapshot/cluster_48:0.013192829200123041 - cluster/prob_snapshot/cluster_49:0.024321556298877604 - cluster/prob_snapshot/cluster_50:0.008145979472164227 - cluster/prob_snapshot/cluster_51:0.014323755649539143 - cluster/prob_snapshot/cluster_52:0.01769700817516621 - cluster/prob_snapshot/cluster_53:0.01158512344887094 - cluster/prob_snapshot/cluster_54:0.018333631937965476 - cluster/prob_snapshot/cluster_55:0.020936625968814074 - cluster/prob_snapshot/cluster_56:0.017418703728271905 - cluster/prob_snapshot/cluster_57:0.012969824714455335 - cluster/prob_snapshot/cluster_58:0.014603881963848752 - cluster/prob_snapshot/cluster_59:0.02060882094550942 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.01769700817516621 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.013839727978743383
(RewardLoopWorker pid=2826754) WARNING:2026-04-12 19:51:31,392:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  32%|███▏      | 256/800 [8:21:27<21:10:32, 140.13s/it]
[36m(TaskRunner pid=2823680)[0m step:256 - global_seqlen/min:371439 - global_seqlen/max:459535 - global_seqlen/minmax_diff:88096 - global_seqlen/balanced_min:424499 - global_seqlen/balanced_max:424607 - global_seqlen/mean:424571.5 - frontier/skipped_zero_acc_count:28.0 - actor/entropy:np.float64(0.22697363413870333) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012170986272394657 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.024579912053013686) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0007773612233722816) - actor/ppo_kl:np.float64(0.0008654310635483853) - actor/pg_clipfrac_lower:np.float64(3.269625278335297e-05) - actor/grad_norm:np.float64(0.28437176919900453) - perf/mfu/actor:np.float64(0.19261772351488535) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(108.33395385742188) - actor/lr:np.float64(1e-06) - training/global_step:256 - training/epoch:0 - critic/score/mean:0.6175000071525574 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6154109835624695 - critic/rewards/max:1.449636459350586 - critic/rewards/min:-0.05162172019481659 - critic/advantages/mean:-0.10855420678853989 - critic/advantages/max:2.4748482704162598 - critic/advantages/min:-2.4748058319091797 - critic/returns/mean:-0.10855420678853989 - critic/returns/max:2.4748482704162598 - critic/returns/min:-2.4748058319091797 - response_length/mean:1298.052490234375 - response_length/max:8192.0 - response_length/min:178.0 - response_length/clip_ratio:0.026249999180436134 - response_length_non_aborted/mean:1298.052490234375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:178.0 - response_length_non_aborted/clip_ratio:0.026249999180436134 - response/aborted_ratio:0.0 - prompt_length/mean:244.1699981689453 - prompt_length/max:478.0 - prompt_length/min:187.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.848643094301224e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4545887429267168) - timing_s/agent_loop/generate_sequences/max:np.float64(30.516823517158628) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.650330529478197) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.516823517158628) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:217 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.0943157710135 - timing_s/reward:0.00016259867697954178 - timing_s/old_log_prob:14.524620783515275 - timing_s/ref:27.799599672667682 - timing_s/adv:0.10199126321822405 - timing_s/update_actor:26.0706390067935 - timing_s/update_weights:40.700945121236145 - timing_s/step:141.71624235715717 - timing_s/stop_profile:5.604792386293411e-05 - timing_per_token_ms/adv:8.26658144481617e-05 - timing_per_token_ms/update_actor:0.021130737463946916 - timing_per_token_ms/gen:0.030906218903909412 - timing_per_token_ms/ref:0.02253209221810381 - perf/total_num_tokens:1698286 - perf/time_per_step:141.71624235715717 - perf/throughput:2995.9268813378726 - frontier/active_count:57.0 - frontier/completed_count:7.0 - frontier/blacklisted_count:975.0 - frontier/mean_score:2.761510794396731 - frontier/mean_frontier_pct:0.4494260374622757 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.409966619299999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:3.145423075099999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:2.06207 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.5429939899999994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.5683509999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:2.3170037 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.701252438490056 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:3.1775524750999993 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:2.3968036999999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:192.0 - frontier/cluster_10/score:4.8237293473565295 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:3.3023509999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:3.203822673256999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.066842513471509 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:3.6617056999999993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - 
frontier/cluster_16/score:2.9802267325699994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.4595856999999997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.211645699999999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.4319299999999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:96.0 - frontier/cluster_20/score:2.887704053509999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.735639312569999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.366098934759299 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:3.756595861099999 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.8400999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:2.3629999999999995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.0974499899999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.8470562999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.530215586999999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.409058972999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.9184240534299997 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.4270562999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.2500145683699997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.2827306810999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:1.9513318129999997 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.3176456999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.7189413899999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.9596463929999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:48.0 - frontier/cluster_40/score:3.8596999999999997 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.0810687325699995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:1.5083500099999996 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:2.4023500099999993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.4270562999999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:3.207995843159299 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.2657524750999998 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:80.0 - frontier/cluster_48/score:2.175056992999999 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:4.0098124750999995 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.361509 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:2.9176456999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.0158225899999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.4517504950999998 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.9102338129999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.1382909999999997 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.4076924750999993 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:2.2817089999999998 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:256.0 - cluster/prob_snapshot/cluster_0:0.02166349517755982 - cluster/prob_snapshot/cluster_1:0.0199828518065676 - 
cluster/prob_snapshot/cluster_2:0.013100316949718705 - cluster/prob_snapshot/cluster_3:0.022508617175919586 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.009963723437326705 - cluster/prob_snapshot/cluster_6:0.014719909044635224 - cluster/prob_snapshot/cluster_7:0.017161038716154285 - cluster/prob_snapshot/cluster_8:0.020186969670366744 - cluster/prob_snapshot/cluster_9:0.015226877912126409 - cluster/prob_snapshot/cluster_10:0.030645120354803813 - cluster/prob_snapshot/cluster_11:0.02097981386627055 - cluster/prob_snapshot/cluster_12:0.020353864063956006 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.019483630022913253 - cluster/prob_snapshot/cluster_15:0.023262791877381266 - cluster/prob_snapshot/cluster_16:0.018933360544836773 - cluster/prob_snapshot/cluster_17:0.009272738212222202 - cluster/prob_snapshot/cluster_18:0.020403563700650895 - cluster/prob_snapshot/cluster_19:0.02180302838858434 - cluster/prob_snapshot/cluster_20:0.01834556458217638 - cluster/prob_snapshot/cluster_21:0.017379498297719082 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.015031810743446448 - cluster/prob_snapshot/cluster_24:0.023865628437643463 - cluster/prob_snapshot/cluster_25:0.011690143020934008 - cluster/prob_snapshot/cluster_26:0.015012123231599945 - cluster/prob_snapshot/cluster_27:0.013325085790096516 - cluster/prob_snapshot/cluster_28:0.018087329675371553 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.016074442739974177 - cluster/prob_snapshot/cluster_31:0.015304735579715446 - cluster/prob_snapshot/cluster_32:0.012187735209715039 - cluster/prob_snapshot/cluster_33:0.015419072478049514 - cluster/prob_snapshot/cluster_34:0.02064732086536841 - cluster/prob_snapshot/cluster_35:0.02085516611122608 - cluster/prob_snapshot/cluster_36:0.012396797986668363 - cluster/prob_snapshot/cluster_37:0.014723987666351128 - cluster/prob_snapshot/cluster_38:0.017273416507057826 - 
cluster/prob_snapshot/cluster_39:0.018802613784882048 - cluster/prob_snapshot/cluster_40:0.024520648344056838 - cluster/prob_snapshot/cluster_41:0.019574009097913795 - cluster/prob_snapshot/cluster_42:0.009582537548245876 - cluster/prob_snapshot/cluster_43:0.020403563700650895 - cluster/prob_snapshot/cluster_44:0.015262113582545643 - cluster/prob_snapshot/cluster_45:0.015419072478049514 - cluster/prob_snapshot/cluster_46:0.02038037618449758 - cluster/prob_snapshot/cluster_47:0.020747303681804375 - cluster/prob_snapshot/cluster_48:0.013818122562280665 - cluster/prob_snapshot/cluster_49:0.025474311896660173 - cluster/prob_snapshot/cluster_50:0.008532070038103566 - cluster/prob_snapshot/cluster_51:0.015002650918549453 - cluster/prob_snapshot/cluster_52:0.01853578366252547 - cluster/prob_snapshot/cluster_53:0.012134217254488319 - cluster/prob_snapshot/cluster_54:0.019159500789556885 - cluster/prob_snapshot/cluster_55:0.021928947861691632 - cluster/prob_snapshot/cluster_56:0.01848869599387431 - cluster/prob_snapshot/cluster_57:0.013584548454092711 - cluster/prob_snapshot/cluster_58:0.015296054227675444 - cluster/prob_snapshot/cluster_59:0.021585606041688934 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.014495682051058265
[36m(TaskRunner pid=2823680)[0m Training Progress:  32%|███▏      | 257/800 [8:23:34<20:33:32, 136.30s/it]
[36m(TaskRunner pid=2823680)[0m step:257 - global_seqlen/min:296238 - global_seqlen/max:452887 - global_seqlen/minmax_diff:156649 - global_seqlen/balanced_min:388204 - global_seqlen/balanced_max:388362 - global_seqlen/mean:388294.5 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.2024631679286154) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013580509461462498 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.0668338526156731) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005510990814233379) - actor/ppo_kl:np.float64(-3.971127747653165e-05) - actor/pg_clipfrac_lower:np.float64(2.014891481162429e-05) - actor/grad_norm:np.float64(0.31986508002647984) - perf/mfu/actor:np.float64(0.1989664867759664) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(109.6678695678711) - actor/lr:np.float64(1e-06) - training/global_step:257 - training/epoch:0 - critic/score/mean:0.655927836894989 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.650292694568634 - critic/rewards/max:1.1307432651519775 - critic/rewards/min:-0.1414700448513031 - critic/advantages/mean:-0.12042548507452011 - critic/advantages/max:2.474717855453491 - critic/advantages/min:-2.4748475551605225 - critic/returns/mean:-0.12042548507452011 - critic/returns/max:2.474717855453491 - critic/returns/min:-2.4748475551605225 - response_length/mean:1163.0631103515625 - response_length/max:8192.0 - response_length/min:162.0 - response_length/clip_ratio:0.018041236326098442 - response_length_non_aborted/mean:1163.0631103515625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:162.0 - response_length_non_aborted/clip_ratio:0.018041236326098442 - response/aborted_ratio:0.0 - prompt_length/mean:238.50515747070312 - prompt_length/max:505.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.839221507310867e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.652670600451529) - timing_s/agent_loop/generate_sequences/max:np.float64(29.58946079388261) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.102435213126228) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.58946079388261) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:216 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.32855381630361 - timing_s/reward:0.0001333095133304596 - timing_s/old_log_prob:12.684307157061994 - timing_s/ref:24.141048731282353 - timing_s/adv:0.08634395804256201 - timing_s/update_actor:23.029680625535548 - timing_s/update_weights:35.23578101862222 - timing_s/step:126.9234308693558 - timing_s/stop_profile:5.13540580868721e-05 - timing_per_token_ms/adv:7.938820195212286e-05 - timing_per_token_ms/update_actor:0.02117443973892974 - timing_per_token_ms/gen:0.03471165593909569 - timing_per_token_ms/ref:0.02219627748672773 - perf/total_num_tokens:1553178 - perf/time_per_step:126.9234308693558 - perf/throughput:3059.2814686807305 - frontier/active_count:56.0 - frontier/completed_count:8.0 - frontier/blacklisted_count:1006.0 - frontier/mean_score:2.756984511636218 - frontier/mean_frontier_pct:0.45362784558857633 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.409966619299999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:3.101796152569999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:2.06207 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.5429939899999994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.5683509999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:2.3170037 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.701252438490056 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:96.0 - frontier/cluster_8/score:3.1242867325699994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:2.3968036999999995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:192.0 - frontier/cluster_10/score:4.8237293473565295 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:3.3023509999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:3.203822673256999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:128.0 - frontier/cluster_14/score:3.046789759430056 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:3.6617056999999993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - 
frontier/cluster_16/score:2.9861587127989995 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.4595856999999997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.211645699999999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.3023509999999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:3.521392837456999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.735639312569999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.366098934759299 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.8400999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:2.3629999999999995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.0974499899999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.8470562999999993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.671150910899999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.409058972999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.9184240534299997 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:2.4270562999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.2500145683699997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:3.1979114767699994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:1.9513318129999997 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:1.9223519899999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.7189413899999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.9596463929999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:4.201789999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.0810687325699995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:1.5083500099999996 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.581645006999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.4270562999999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:3.207995843159299 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.2657524750999998 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:80.0 - frontier/cluster_48/score:2.175056992999999 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:3.7068687325699994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.361509 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:2.9176456999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.0110758129999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.4517504950999998 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.9102338129999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.3968036999999995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.4076924750999993 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:2.4971962999999997 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:257.0 - cluster/prob_snapshot/cluster_0:0.022086544484354088 - cluster/prob_snapshot/cluster_1:0.02009050713792589 - 
cluster/prob_snapshot/cluster_2:0.01335614270447062 - cluster/prob_snapshot/cluster_3:0.022948170203495398 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.010158297131862254 - cluster/prob_snapshot/cluster_6:0.015007362535697837 - cluster/prob_snapshot/cluster_7:0.017496163102742605 - cluster/prob_snapshot/cluster_8:0.020236179882297475 - cluster/prob_snapshot/cluster_9:0.015524231598249907 - cluster/prob_snapshot/cluster_10:0.031243564733998724 - cluster/prob_snapshot/cluster_11:0.021389512100099053 - cluster/prob_snapshot/cluster_12:0.0207513386179126 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.01973422765350718 - cluster/prob_snapshot/cluster_15:0.023717072587726644 - cluster/prob_snapshot/cluster_16:0.019341516973886308 - cluster/prob_snapshot/cluster_17:0.009453818201421214 - cluster/prob_snapshot/cluster_18:0.020802008799604005 - cluster/prob_snapshot/cluster_19:0.021389512100099053 - cluster/prob_snapshot/cluster_20:0.022808258330501097 - cluster/prob_snapshot/cluster_21:0.017718888809131028 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01532535511672306 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.01191843060152972 - cluster/prob_snapshot/cluster_26:0.015305283142989363 - cluster/prob_snapshot/cluster_27:0.013585300878209989 - cluster/prob_snapshot/cluster_28:0.018440542867343066 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01730119382521305 - cluster/prob_snapshot/cluster_31:0.01560360968680667 - cluster/prob_snapshot/cluster_32:0.012425739875610454 - cluster/prob_snapshot/cluster_33:0.015720179380226885 - cluster/prob_snapshot/cluster_34:0.021050526105689044 - cluster/prob_snapshot/cluster_35:0.020713051467702116 - cluster/prob_snapshot/cluster_36:0.012638885274603374 - cluster/prob_snapshot/cluster_37:0.012451181340431255 - cluster/prob_snapshot/cluster_38:0.01761073543086884 - 
cluster/prob_snapshot/cluster_39:0.019169795195934054 - cluster/prob_snapshot/cluster_40:0.027215228801261645 - cluster/prob_snapshot/cluster_41:0.01995625447947322 - cluster/prob_snapshot/cluster_42:0.00976966736427458 - cluster/prob_snapshot/cluster_43:0.020802008799604005 - cluster/prob_snapshot/cluster_44:0.016721459080329983 - cluster/prob_snapshot/cluster_45:0.015720179380226885 - cluster/prob_snapshot/cluster_46:0.020778368472740574 - cluster/prob_snapshot/cluster_47:0.021152461407669836 - cluster/prob_snapshot/cluster_48:0.01408796577655693 - cluster/prob_snapshot/cluster_49:0.024009596075276328 - cluster/prob_snapshot/cluster_50:0.008698686102850071 - cluster/prob_snapshot/cluster_51:0.015295625852610103 - cluster/prob_snapshot/cluster_52:0.018897754358622683 - cluster/prob_snapshot/cluster_53:0.012371176810456913 - cluster/prob_snapshot/cluster_54:0.019502906425294914 - cluster/prob_snapshot/cluster_55:0.022357180984536276 - cluster/prob_snapshot/cluster_56:0.018849747152038323 - cluster/prob_snapshot/cluster_57:0.015524231598249907 - cluster/prob_snapshot/cluster_58:0.015594758803491477 - cluster/prob_snapshot/cluster_59:0.022007134312996227 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.01617448008257529
[36m(RewardLoopWorker pid=2826759)[0m WARNING:2026-04-12 19:56:00,334:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  32%|███▏      | 258/800 [8:26:01<20:58:26, 139.31s/it]
[36m(TaskRunner pid=2823680)[0m step:258 - global_seqlen/min:359807 - global_seqlen/max:496622 - global_seqlen/minmax_diff:136815 - global_seqlen/balanced_min:423184 - global_seqlen/balanced_max:423408 - global_seqlen/mean:423253.5 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.2054724768631988) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009627231396734715 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.056229250112664886) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0021387481570450794) - actor/ppo_kl:np.float64(0.0002376222527851092) - actor/pg_clipfrac_lower:np.float64(0.00019292241344778127) - actor/grad_norm:np.float64(0.3737381622195244) - perf/mfu/actor:np.float64(0.23307731542426016) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.63091278076172) - actor/lr:np.float64(1e-06) - training/global_step:258 - training/epoch:0 - critic/score/mean:0.6138888597488403 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6230305433273315 - critic/rewards/max:1.3570656776428223 - critic/rewards/min:-0.0680650994181633 - critic/advantages/mean:-0.0011827187845483422 - critic/advantages/max:2.4748222827911377 - critic/advantages/min:-2.4748647212982178 - critic/returns/mean:-0.0011827187845483422 - critic/returns/max:2.4748222827911377 - critic/returns/min:-2.4748647212982178 - response_length/mean:1365.5028076171875 - response_length/max:8192.0 - response_length/min:113.0 - response_length/clip_ratio:0.03750000149011612 - response_length_non_aborted/mean:1365.5028076171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:113.0 - response_length_non_aborted/clip_ratio:0.03750000149011612 - response/aborted_ratio:0.0 - prompt_length/mean:237.022216796875 - prompt_length/max:355.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.659212082624435e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9640906816348433) - timing_s/agent_loop/generate_sequences/max:np.float64(30.651973146013916) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.472057029203825) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.651973146013916) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:213 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:33.40715507045388 - timing_s/reward:0.00017044134438037872 - timing_s/old_log_prob:13.02091489546001 - timing_s/ref:35.9942429875955 - timing_s/adv:0.07363669574260712 - timing_s/update_actor:21.58821214362979 - timing_s/update_weights:41.60496131423861 - timing_s/step:146.07206094171852 - timing_s/stop_profile:7.558520883321762e-05 - timing_per_token_ms/adv:6.382002685224803e-05 - timing_per_token_ms/update_actor:0.018710240387677945 - timing_per_token_ms/gen:0.033979298498572855 - timing_per_token_ms/ref:0.031195771766080526 - perf/total_num_tokens:1693014 - perf/time_per_step:146.07206094171852 - perf/throughput:2897.566429002973 - frontier/active_count:56.0 - frontier/completed_count:8.0 - frontier/blacklisted_count:1044.0 - frontier/mean_score:2.7472363490539506 - frontier/mean_frontier_pct:0.4675284726119981 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.409966619299999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:3.101796152569999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:2.06207 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.3800957929999993 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.9978456999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.92190259 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.701252438490056 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:96.0 - frontier/cluster_8/score:3.0870007127989996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.5777625899999994 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:208.0 - frontier/cluster_10/score:4.876610543149571 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:3.3023509999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:3.203822673256999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:128.0 - frontier/cluster_14/score:3.032752831601039 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.863193989999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - 
frontier/cluster_16/score:2.9861587127989995 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.4595856999999997 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.211645699999999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.3023509999999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:3.521392837456999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.735639312569999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.366098934759299 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:2.1880699999999997 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:2.3629999999999995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.0974499899999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.8929394099999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.671150910899999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.409058972999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.9184240534299997 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.5989394099999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.2500145683699997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:3.1979114767699994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:1.9513318129999997 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.6456463929999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.8032589729999993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.9596463929999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:4.201789999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.0810687325699995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:1.5083500099999996 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.581645006999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.4270562999999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:3.207995843159299 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.2657524750999998 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.422539895099999 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:3.7068687325699994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.361509 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:2.9176456999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.0110758129999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.4517504950999998 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.9371636690999994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.3968036999999995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.4076924750999993 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:80.0 - frontier/cluster_63/score:2.0480374099999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:258.0 - cluster/prob_snapshot/cluster_0:0.022164915326596365 - cluster/prob_snapshot/cluster_1:0.02016179533633959 - 
cluster/prob_snapshot/cluster_2:0.013403535004954696 - cluster/prob_snapshot/cluster_3:0.021970753796706998 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.012986074563156544 - cluster/prob_snapshot/cluster_6:0.012492441401687671 - cluster/prob_snapshot/cluster_7:0.017558245654376768 - cluster/prob_snapshot/cluster_8:0.02006562440378915 - cluster/prob_snapshot/cluster_9:0.016755556847986575 - cluster/prob_snapshot/cluster_10:0.031698157735012104 - cluster/prob_snapshot/cluster_11:0.02146540962583576 - cluster/prob_snapshot/cluster_12:0.020824971679267786 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.019713011071273042 - cluster/prob_snapshot/cluster_15:0.018610872022259622 - cluster/prob_snapshot/cluster_16:0.01941014749128393 - cluster/prob_snapshot/cluster_17:0.009487363679545942 - cluster/prob_snapshot/cluster_18:0.02087582165661797 - cluster/prob_snapshot/cluster_19:0.02146540962583576 - cluster/prob_snapshot/cluster_20:0.022889190067772496 - cluster/prob_snapshot/cluster_21:0.01778176167005106 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.015379734876717222 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.014222539893549309 - cluster/prob_snapshot/cluster_26:0.01535959168054816 - cluster/prob_snapshot/cluster_27:0.013633506312640634 - cluster/prob_snapshot/cluster_28:0.018804218406333433 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017362584557151194 - cluster/prob_snapshot/cluster_31:0.01565897679206123 - cluster/prob_snapshot/cluster_32:0.01246983077902112 - cluster/prob_snapshot/cluster_33:0.01689320699961267 - cluster/prob_snapshot/cluster_34:0.021125220789672525 - cluster/prob_snapshot/cluster_35:0.0207865486727575 - cluster/prob_snapshot/cluster_36:0.01268373249299355 - cluster/prob_snapshot/cluster_37:0.01069676540289754 - cluster/prob_snapshot/cluster_38:0.01822129208637866 - 
cluster/prob_snapshot/cluster_39:0.019237816383955636 - cluster/prob_snapshot/cluster_40:0.027311798022602814 - cluster/prob_snapshot/cluster_41:0.020027066302149486 - cluster/prob_snapshot/cluster_42:0.00980433358652168 - cluster/prob_snapshot/cluster_43:0.02087582165661797 - cluster/prob_snapshot/cluster_44:0.016780792709118024 - cluster/prob_snapshot/cluster_45:0.015775960115828182 - cluster/prob_snapshot/cluster_46:0.020852097445544927 - cluster/prob_snapshot/cluster_47:0.02122751779402265 - cluster/prob_snapshot/cluster_48:0.01574660330874903 - cluster/prob_snapshot/cluster_49:0.024094790582169394 - cluster/prob_snapshot/cluster_50:0.008729552106210826 - cluster/prob_snapshot/cluster_51:0.015349900122699793 - cluster/prob_snapshot/cluster_52:0.018964810249897213 - cluster/prob_snapshot/cluster_53:0.01241507410488658 - cluster/prob_snapshot/cluster_54:0.0195721096093333 - cluster/prob_snapshot/cluster_55:0.02243651214044264 - cluster/prob_snapshot/cluster_56:0.019091677806312598 - cluster/prob_snapshot/cluster_57:0.015579317042076617 - cluster/prob_snapshot/cluster_58:0.01565009450269334 - cluster/prob_snapshot/cluster_59:0.02208522337996957 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.013312322625513077
(RewardLoopWorker pid=2826756) WARNING:2026-04-12 19:58:21,780:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  32%|███▏      | 259/800 [8:28:32<21:28:18, 142.88s/it]
[36m(TaskRunner pid=2823680)[0m step:259 - global_seqlen/min:366260 - global_seqlen/max:520380 - global_seqlen/minmax_diff:154120 - global_seqlen/balanced_min:451063 - global_seqlen/balanced_max:451341 - global_seqlen/mean:451191.5 - frontier/skipped_zero_acc_count:40.0 - actor/entropy:np.float64(0.2099695699289441) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012083668261766434 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.07138247953844257) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0008379923070937035) - actor/ppo_kl:np.float64(0.00035042533605249157) - actor/pg_clipfrac_lower:np.float64(2.2888115836394718e-05) - actor/grad_norm:np.float64(0.30082791908220813) - perf/mfu/actor:np.float64(0.261241951686294) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(112.59805679321289) - actor/lr:np.float64(1e-06) - training/global_step:259 - training/epoch:0 - critic/score/mean:0.59375 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5904436707496643 - critic/rewards/max:1.065982699394226 - critic/rewards/min:-0.052345674484968185 - critic/advantages/mean:-0.10177864879369736 - critic/advantages/max:2.4748079776763916 - critic/advantages/min:-2.4748597145080566 - critic/returns/mean:-0.10177864879369736 - critic/returns/max:2.4748079776763916 - critic/returns/min:-2.4748597145080566 - response_length/mean:1151.4317626953125 - response_length/max:8192.0 - response_length/min:119.0 - response_length/clip_ratio:0.018465908244252205 - response_length_non_aborted/mean:1151.4317626953125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:119.0 - response_length_non_aborted/clip_ratio:0.018465908244252205 - response/aborted_ratio:0.0 - prompt_length/mean:231.76136779785156 - prompt_length/max:544.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.70097428560257e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0711181545630097) - timing_s/agent_loop/generate_sequences/max:np.float64(33.28911003097892) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.975404931800767) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(33.28911003097892) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:203 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:36.49154280219227 - timing_s/reward:0.000153224915266037 - timing_s/old_log_prob:12.783624870702624 - timing_s/ref:24.900763476267457 - timing_s/adv:0.12964288890361786 - timing_s/update_actor:20.83594640996307 - timing_s/update_weights:55.32121809478849 - timing_s/step:150.95906731765717 - timing_s/stop_profile:7.480569183826447e-05 - timing_per_token_ms/adv:0.0001331352939340971 - timing_per_token_ms/update_actor:0.021397238777576457 - timing_per_token_ms/gen:0.04501749649916146 - timing_per_token_ms/ref:0.02557155654762475 - perf/total_num_tokens:1804766 - perf/time_per_step:150.95906731765717 - perf/throughput:2988.8333838905855 - frontier/active_count:55.0 - frontier/completed_count:9.0 - frontier/blacklisted_count:1084.0 - frontier/mean_score:2.665872252548937 - frontier/mean_frontier_pct:0.47372924764441976 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.409966619299999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:3.101796152569999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.7434489999999998 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:3.2660670550999993 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.9978456999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.6453318129999999 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.701252438490056 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:96.0 - frontier/cluster_8/score:3.0870007127989996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.5777625899999994 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:3.3023509999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:3.203822673256999 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:3.022926982120727 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.863193989999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - frontier/cluster_16/score:2.9861587127989995 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.9217099899999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.1481519899999992 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.3023509999999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:3.521392837456999 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:2.2149475187989993 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.366098934759299 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:2.1880699999999997 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:2.3629999999999995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.0974499899999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.8929394099999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:112.0 - frontier/cluster_30/score:2.7698056376299993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.409058972999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.9184240534299997 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.5989394099999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.1750101978589997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:3.1979114767699994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:1.9513318129999997 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.6456463929999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:2.2622812810999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.9596463929999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.841252999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.0810687325699995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:112.0 - frontier/cluster_42/score:1.3558450069999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.581645006999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.4270562999999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:3.1455970902115094 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.2657524750999998 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.422539895099999 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:3.7068687325699994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.5530562999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:2.9176456999999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:80.0 - frontier/cluster_54/score:2.407753069099999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.4517504950999998 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.9371636690999994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.3968036999999995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.4076924750999993 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:80.0 - frontier/cluster_63/score:2.0480374099999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:259.0 - cluster/prob_snapshot/cluster_0:0.023256700698580712 - cluster/prob_snapshot/cluster_1:0.02115491229152798 - cluster/prob_snapshot/cluster_2:0.01189069457359185 - 
cluster/prob_snapshot/cluster_3:0.022275274934377018 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.013625734405688842 - cluster/prob_snapshot/cluster_6:0.011221514400821097 - cluster/prob_snapshot/cluster_7:0.018423118606999953 - cluster/prob_snapshot/cluster_8:0.021054004232044204 - cluster/prob_snapshot/cluster_9:0.01758089146337006 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.022522739188697582 - cluster/prob_snapshot/cluster_12:0.021850754955061797 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.02061700770293098 - cluster/prob_snapshot/cluster_15:0.01952759457835233 - cluster/prob_snapshot/cluster_16:0.020366240252604514 - cluster/prob_snapshot/cluster_17:0.013106472601211876 - cluster/prob_snapshot/cluster_18:0.02147106900421829 - cluster/prob_snapshot/cluster_19:0.022522739188697582 - cluster/prob_snapshot/cluster_20:0.02401665130659695 - cluster/prob_snapshot/cluster_21:0.015106415181960582 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.016137300124135466 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.01492310476282307 - cluster/prob_snapshot/cluster_26:0.016116164727157226 - cluster/prob_snapshot/cluster_27:0.014305056938558727 - cluster/prob_snapshot/cluster_28:0.01973046469625266 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.018890666067817957 - cluster/prob_snapshot/cluster_31:0.016430296761025903 - cluster/prob_snapshot/cluster_32:0.013084061811941834 - cluster/prob_snapshot/cluster_33:0.017725321899052394 - cluster/prob_snapshot/cluster_34:0.021654247718620272 - cluster/prob_snapshot/cluster_35:0.02181043933241301 - cluster/prob_snapshot/cluster_36:0.01330849976117239 - cluster/prob_snapshot/cluster_37:0.0112236599036141 - cluster/prob_snapshot/cluster_38:0.015429241551151874 - cluster/prob_snapshot/cluster_39:0.020185420568652013 - 
cluster/prob_snapshot/cluster_40:0.02619816593596567 - cluster/prob_snapshot/cluster_41:0.021013546859835655 - cluster/prob_snapshot/cluster_42:0.009247152550700652 - cluster/prob_snapshot/cluster_43:0.021904109668415647 - cluster/prob_snapshot/cluster_44:0.017607370376578488 - cluster/prob_snapshot/cluster_45:0.016553042375321507 - cluster/prob_snapshot/cluster_46:0.021453644041944622 - cluster/prob_snapshot/cluster_47:0.022273129431584018 - cluster/prob_snapshot/cluster_48:0.016522239529217848 - cluster/prob_snapshot/cluster_49:0.02528163652816193 - cluster/prob_snapshot/cluster_50:0.009159546859319575 - cluster/prob_snapshot/cluster_51:0.017412389288407337 - cluster/prob_snapshot/cluster_52:0.019898966871215385 - cluster/prob_snapshot/cluster_53:0.0130266079682058 - cluster/prob_snapshot/cluster_54:0.016421390217492157 - cluster/prob_snapshot/cluster_55:0.02354167565640208 - cluster/prob_snapshot/cluster_56:0.020032083589435937 - cluster/prob_snapshot/cluster_57:0.0163467131814896 - cluster/prob_snapshot/cluster_58:0.016420976953469526 - cluster/prob_snapshot/cluster_59:0.023173083336480407 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.01396805258863328
[36m(TaskRunner pid=2823680)[0m step:260 - global_seqlen/min:419478 - global_seqlen/max:438150 - global_seqlen/minmax_diff:18672 - global_seqlen/balanced_min:429180 - global_seqlen/balanced_max:429298 - global_seqlen/mean:429232.5 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.20014340267516673) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01182045228779316 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0010203860438195989) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0008154391595477742) - actor/ppo_kl:np.float64(6.187298105804946e-05) - actor/pg_clipfrac_lower:np.float64(1.1146993055414592e-05) - actor/grad_norm:np.float64(0.33710253859559697) - perf/mfu/actor:np.float64(0.1934160904426028) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(109.93026351928711) - actor/lr:np.float64(1e-06) - training/global_step:260 - training/epoch:0 - critic/score/mean:0.5907894968986511 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5898209810256958 - critic/rewards/max:1.1119327545166016 - critic/rewards/min:-0.3733958601951599 - critic/advantages/mean:-0.16062723100185394 - critic/advantages/max:2.474766254425049 - critic/advantages/min:-2.474839925765991 - critic/returns/mean:-0.16062723100185394 - critic/returns/max:2.474766254425049 - critic/returns/min:-2.474839925765991 - response_length/mean:1319.0382080078125 - response_length/max:8192.0 - response_length/min:179.0 - response_length/clip_ratio:0.028947368264198303 - response_length_non_aborted/mean:1319.0382080078125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:179.0 - response_length_non_aborted/clip_ratio:0.028947368264198303 - response/aborted_ratio:0.0 - prompt_length/mean:238.1052703857422 - prompt_length/max:549.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.18954610824585e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.87633709423244) - timing_s/agent_loop/generate_sequences/max:np.float64(31.68530136719346) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.005829139461639) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(31.68530136719346) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:210 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:34.90354110393673 - timing_s/reward:0.00033674854785203934 - timing_s/old_log_prob:16.92505580279976 - timing_s/ref:56.80374675896019 - timing_s/adv:0.080887196585536 - timing_s
[36m(TaskRunner pid=2823680)[0m Training Progress:  32%|███▎      | 260/800 [8:31:31<23:02:42, 153.63s/it]
[36m(TaskRunner pid=2823680)[0m /update_actor:26.28803360555321 - timing_s/update_weights:43.03501334134489 - timing_s/step:178.47778829652816 - timing_s/stop_profile:6.895884871482849e-05 - timing_per_token_ms/adv:6.834985164765778e-05 - timing_per_token_ms/update_actor:0.02221344381923479 - timing_per_token_ms/gen:0.03481757650753962 - timing_per_token_ms/ref:0.04799928576953935 - perf/total_num_tokens:1716930 - perf/time_per_step:178.47778829652816 - perf/throughput:2404.963127887156 - frontier/active_count:55.0 - frontier/completed_count:9.0 - frontier/blacklisted_count:1117.0 - frontier/mean_score:2.666924718937235 - frontier/mean_frontier_pct:0.4930765050050033 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.409966619299999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:3.101796152569999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:2.1204142999999998 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:3.186246938569999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.9978456999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.6453318129999999 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.701252438490056 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:3.0609004989592994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - 
frontier/cluster_9/score:2.5777625899999994 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:3.3023509999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:3.142675871279899 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:3.022926982120727 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.863193989999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - frontier/cluster_16/score:2.9861587127989995 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.9217099899999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.1481519899999992 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.3023509999999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:3.364974986219899 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:2.4504632631592993 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.366098934759299 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:2.1880699999999997 - 
frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.9540999999999997 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.0974499899999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.9250575869999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:128.0 - frontier/cluster_30/score:2.238863946340999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.409058972999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.9184240534299997 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.5989394099999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.1750101978589997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:3.1979114767699994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:96.0 - frontier/cluster_36/score:2.2659322690999995 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.6456463929999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:2.2622812810999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.9596463929999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:3.841252999999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.0810687325699995 - 
frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:112.0 - frontier/cluster_42/score:1.3558450069999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.581645006999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.5989394099999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:3.1455970902115094 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.2657524750999998 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.422539895099999 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:3.4948081127989994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:1.343 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.5530562999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:2.3423519899999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:80.0 - frontier/cluster_54/score:2.585427148369999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.4517504950999998 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:3.5560145683699993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.5777625899999994 - frontier/cluster_57/cluster_size:209.0 - 
frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.4076924750999993 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:80.0 - frontier/cluster_63/score:2.0480374099999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:260.0 - cluster/prob_snapshot/cluster_0:0.023247522750806573 - cluster/prob_snapshot/cluster_1:0.02114656378660914 - cluster/prob_snapshot/cluster_2:0.014455971328688486 - cluster/prob_snapshot/cluster_3:0.021722308885621728 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.013620357190735592 - cluster/prob_snapshot/cluster_6:0.011217085979332929 - cluster/prob_snapshot/cluster_7:0.01841584816814436 - cluster/prob_snapshot/cluster_8:0.020867756764762393 - cluster/prob_snapshot/cluster_9:0.017573953398260785 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.022513850889076602 - cluster/prob_snapshot/cluster_12:0.021425262171917692 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.02060887147490754 - cluster/prob_snapshot/cluster_15:0.01951988827273669 - cluster/prob_snapshot/cluster_16:0.02035820298662184 - cluster/prob_snapshot/cluster_17:0.013101300306027098 - cluster/prob_snapshot/cluster_18:0.021462595732255528 - cluster/prob_snapshot/cluster_19:0.022513850889076602 - cluster/prob_snapshot/cluster_20:0.02294079129844993 - cluster/prob_snapshot/cluster_21:0.01670608742557304 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01613093175315824 - 
cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.01491721555790461 - cluster/prob_snapshot/cluster_26:0.013322119914674302 - cluster/prob_snapshot/cluster_27:0.014299411637998265 - cluster/prob_snapshot/cluster_28:0.019941644711806896 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.015263504409086396 - cluster/prob_snapshot/cluster_31:0.016423812762820796 - cluster/prob_snapshot/cluster_32:0.013078898360877129 - cluster/prob_snapshot/cluster_33:0.017718326836391626 - cluster/prob_snapshot/cluster_34:0.021645702157612908 - cluster/prob_snapshot/cluster_35:0.021801832132461552 - cluster/prob_snapshot/cluster_36:0.01544804329741581 - cluster/prob_snapshot/cluster_37:0.011219230635431775 - cluster/prob_snapshot/cluster_38:0.015423152606078091 - cluster/prob_snapshot/cluster_39:0.020177454660753935 - cluster/prob_snapshot/cluster_40:0.02618782717803715 - cluster/prob_snapshot/cluster_41:0.021005254142905227 - cluster/prob_snapshot/cluster_42:0.009243503284870997 - cluster/prob_snapshot/cluster_43:0.021895465502711264 - cluster/prob_snapshot/cluster_44:0.017600421861918104 - cluster/prob_snapshot/cluster_45:0.017718326836391626 - cluster/prob_snapshot/cluster_46:0.021445177646511587 - cluster/prob_snapshot/cluster_47:0.022264339637135564 - cluster/prob_snapshot/cluster_48:0.01651571924702149 - cluster/prob_snapshot/cluster_49:0.023825931506824046 - cluster/prob_snapshot/cluster_50:0.009155932165911462 - cluster/prob_snapshot/cluster_51:0.017405517720441476 - cluster/prob_snapshot/cluster_52:0.015969036432708653 - cluster/prob_snapshot/cluster_53:0.013021467190536777 - cluster/prob_snapshot/cluster_54:0.017626206694252883 - cluster/prob_snapshot/cluster_55:0.02353238524704907 - cluster/prob_snapshot/cluster_56:0.024243207869686256 - cluster/prob_snapshot/cluster_57:0.017573953398260785 - cluster/prob_snapshot/cluster_58:0.01641449663320258 - cluster/prob_snapshot/cluster_59:0.023163938387186677 - 
cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.013962540282359642
[36m(RewardLoopWorker pid=2826760)[0m WARNING:2026-04-12 20:03:54,464:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(RewardLoopWorker pid=2826760)[0m WARNING:2026-04-12 20:03:56,899:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  33%|███▎      | 261/800 [8:33:54<22:31:56, 150.49s/it]
[36m(TaskRunner pid=2823680)[0m step:261 - global_seqlen/min:407472 - global_seqlen/max:510594 - global_seqlen/minmax_diff:103122 - global_seqlen/balanced_min:452960 - global_seqlen/balanced_max:453081 - global_seqlen/mean:453030.5 - frontier/skipped_zero_acc_count:36.0 - actor/entropy:np.float64(0.1912905896163505) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010026265867054462 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.00980343873379752) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0010461265357425082) - actor/ppo_kl:np.float64(-0.0004974045242187972) - actor/pg_clipfrac_lower:np.float64(0.00013172485598814725) - actor/grad_norm:np.float64(0.24336348722378412) - perf/mfu/actor:np.float64(0.2479105446404664) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(113.44389343261719) - actor/lr:np.float64(1e-06) - training/global_step:261 - training/epoch:0 - critic/score/mean:0.5964673757553101 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6007493734359741 - critic/rewards/max:1.0868831872940063 - critic/rewards/min:-0.06220376864075661 - critic/advantages/mean:-0.0975644439458847 - critic/advantages/max:2.474248170852661 - critic/advantages/min:-2.4748477935791016 - critic/returns/mean:-0.0975644439458847 - critic/returns/max:2.474248170852661 - critic/returns/min:-2.4748477935791016 - response_length/mean:1389.6005859375 - response_length/max:8192.0 - response_length/min:145.0 - response_length/clip_ratio:0.036684781312942505 - response_length_non_aborted/mean:1389.6005859375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:145.0 - response_length_non_aborted/clip_ratio:0.036684781312942505 - response/aborted_ratio:0.0 - prompt_length/mean:237.67391967773438 - prompt_length/max:646.0 - prompt_length/min:162.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.95192676782608e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.185778877697885) - timing_s/agent_loop/generate_sequences/max:np.float64(34.93233000487089) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.852034329163871) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(34.93233000487089) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:221 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:37.50804460700601 - timing_s/reward:0.0001361556351184845 - timing_s/old_log_prob:12.739607355557382 - timing_s/ref:28.840902633033693 - timing_s/adv:0.07279390096664429 - timing_s/update_actor:21.959110187366605 - timing_s/update_weights:41.383606783114374 - timing_s/step:142.9068524185568 - timing_s/stop_profile:0.00010367017239332199 - timing_per_token_ms/adv:6.077939486591868e-05 - timing_per_token_ms/update_actor:0.018334797438507144 - timing_per_token_ms/gen:0.03667386096548509 - timing_per_token_ms/ref:0.02408076207134303 - perf/total_num_tokens:1812122 - perf/time_per_step:142.9068524185568 - perf/throughput:3170.1104064144433 - frontier/active_count:54.0 - frontier/completed_count:10.0 - frontier/blacklisted_count:1152.0 - frontier/mean_score:2.67792112369606 - frontier/mean_frontier_pct:0.501509198490884 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.409966619299999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:3.101796152569999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:2.1204142999999998 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:3.186246938569999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.9978456999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:2.0517322690999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.701252438490056 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:3.0609004989592994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.5777625899999994 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:3.3023509999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:3.142675871279899 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:3.022926982120727 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.863193989999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - frontier/cluster_16/score:2.9861587127989995 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.9217099899999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.1481519899999992 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.3023509999999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:3.364974986219899 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:2.4504632631592993 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.366098934759299 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:2.4316489999999997 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.9540999999999997 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.3682149929999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.9250575869999995 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:128.0 - frontier/cluster_30/score:2.238863946340999 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.409058972999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.9184240534299997 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:2.7192575869999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:112.0 - frontier/cluster_34/score:3.1225071385012995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.7385380337389993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:96.0 - frontier/cluster_36/score:2.2659322690999995 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.6456463929999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:2.4835968967699995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.9717524750999993 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:4.188877099999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:3.0567481127989993 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:112.0 - frontier/cluster_42/score:1.3558450069999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.581645006999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.7192575869999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:3.1455970902115094 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.2657524750999998 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:112.0 - frontier/cluster_48/score:1.9957779265699993 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:3.4948081127989994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:1.2401 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.5530562999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:2.3423519899999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:2.709799003858999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:3.5560145683699993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.5777625899999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:1.9853847325699994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:80.0 - frontier/cluster_63/score:2.0480374099999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:261.0 - cluster/prob_snapshot/cluster_0:0.023580802820615178 - cluster/prob_snapshot/cluster_1:0.0214497241848399 - cluster/prob_snapshot/cluster_2:0.014663214362074024 - 
cluster/prob_snapshot/cluster_3:0.022033723254344214 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.013815620731027814 - cluster/prob_snapshot/cluster_6:0.014188260320352416 - cluster/prob_snapshot/cluster_7:0.018679860606323432 - cluster/prob_snapshot/cluster_8:0.02116692014254929 - cluster/prob_snapshot/cluster_9:0.017825896303239008 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.022836612925978435 - cluster/prob_snapshot/cluster_12:0.02173241803316216 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.020904323433271383 - cluster/prob_snapshot/cluster_15:0.01979972840004523 - cluster/prob_snapshot/cluster_16:0.020650061322896558 - cluster/prob_snapshot/cluster_17:0.013289122566806462 - cluster/prob_snapshot/cluster_18:0.021770286813176047 - cluster/prob_snapshot/cluster_19:0.022836612925978435 - cluster/prob_snapshot/cluster_20:0.02326967401887427 - cluster/prob_snapshot/cluster_21:0.0169455884701835 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01636218727738754 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.01681548296496724 - cluster/prob_snapshot/cluster_26:0.013513107879403026 - cluster/prob_snapshot/cluster_27:0.016376820368470744 - cluster/prob_snapshot/cluster_28:0.02022753120444056 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01548232436119539 - cluster/prob_snapshot/cluster_31:0.016659267074352824 - cluster/prob_snapshot/cluster_32:0.013266399463917519 - cluster/prob_snapshot/cluster_33:0.018804370874067937 - cluster/prob_snapshot/cluster_34:0.021592946019535392 - cluster/prob_snapshot/cluster_35:0.0258529592964364 - cluster/prob_snapshot/cluster_36:0.01566950882747494 - cluster/prob_snapshot/cluster_37:0.011380071255288608 - cluster/prob_snapshot/cluster_38:0.017174716132748367 - cluster/prob_snapshot/cluster_39:0.02055043845602029 - 
cluster/prob_snapshot/cluster_40:0.02896717063909774 - cluster/prob_snapshot/cluster_41:0.021138205316214364 - cluster/prob_snapshot/cluster_42:0.00937601957286779 - cluster/prob_snapshot/cluster_43:0.022209362271388793 - cluster/prob_snapshot/cluster_44:0.01785274422287149 - cluster/prob_snapshot/cluster_45:0.018804370874067937 - cluster/prob_snapshot/cluster_46:0.021752619019069844 - cluster/prob_snapshot/cluster_47:0.0225835246422669 - cluster/prob_snapshot/cluster_48:0.013801321541922979 - cluster/prob_snapshot/cluster_49:0.024167503733721805 - cluster/prob_snapshot/cluster_50:0.008575612855661273 - cluster/prob_snapshot/cluster_51:0.017655045905577777 - cluster/prob_snapshot/cluster_52:0.016197971000667496 - cluster/prob_snapshot/cluster_53:0.013208144951466036 - cluster/prob_snapshot/cluster_54:0.01873896232058007 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.024590762234846013 - cluster/prob_snapshot/cluster_57:0.017825896303239008 - cluster/prob_snapshot/cluster_58:0.013729449912152976 - cluster/prob_snapshot/cluster_59:0.023496020178673725 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.014162709412201609
[36m(RewardLoopWorker pid=2826762)[0m WARNING:2026-04-12 20:06:21,982:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  33%|███▎      | 262/800 [8:36:01<21:25:58, 143.42s/it]
[36m(TaskRunner pid=2823680)[0m step:262 - global_seqlen/min:356992 - global_seqlen/max:474833 - global_seqlen/minmax_diff:117841 - global_seqlen/balanced_min:405467 - global_seqlen/balanced_max:405925 - global_seqlen/mean:405713.75 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.2187747713988242) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013260706327855587 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.047231535078026354) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0004782302478459947) - actor/ppo_kl:np.float64(0.00023263144577148742) - actor/pg_clipfrac_lower:np.float64(4.982469244682959e-06) - actor/grad_norm:np.float64(0.30141738678018254) - perf/mfu/actor:np.float64(0.23256335013990143) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(113.0787763595581) - actor/lr:np.float64(1e-06) - training/global_step:262 - training/epoch:0 - critic/score/mean:0.5686812996864319 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5610470771789551 - critic/rewards/max:1.4024324417114258 - critic/rewards/min:-0.337013304233551 - critic/advantages/mean:-0.04985588788986206 - critic/advantages/max:2.4748268127441406 - critic/advantages/min:-2.4748551845550537 - critic/returns/mean:-0.04985588788986206 - critic/returns/max:2.4748268127441406 - critic/returns/min:-2.4748551845550537 - response_length/mean:1161.5343017578125 - response_length/max:8192.0 - response_length/min:168.0 - response_length/clip_ratio:0.006868131924420595 - response_length_non_aborted/mean:1161.5343017578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:168.0 - response_length_non_aborted/clip_ratio:0.006868131924420595 - response/aborted_ratio:0.0 - prompt_length/mean:250.06593322753906 - prompt_length/max:434.0 - prompt_length/min:184.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.638575673103333e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2987063713371754) - timing_s/agent_loop/generate_sequences/max:np.float64(28.974136155098677) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.285053989402513) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(28.974136155098677) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:193 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:30.591655946336687 - timing_s/reward:0.00020284578204154968 - timing_s/old_log_prob:11.61259615700692 - timing_s/ref:26.52969573903829 - timing_s/adv:0.08281078655272722 - timing_s/update_actor:20.534038566052914 - timing_s/update_weights:36.84891445375979 - timing_s/step:126.66956110484898 - timing_s/stop_profile:6.191059947013855e-05 - timing_per_token_ms/adv:8.058306764760907e-05 - timing_per_token_ms/update_actor:0.019981645963394863 - timing_per_token_ms/gen:0.0361775833480212 - timing_per_token_ms/ref:0.02581601208494985 - perf/total_num_tokens:1622855 - perf/time_per_step:126.66956110484898 - perf/throughput:3202.9300998696604 - frontier/active_count:54.0 - frontier/completed_count:10.0 - frontier/blacklisted_count:1189.0 - frontier/mean_score:2.6789058498955565 - frontier/mean_frontier_pct:0.5168693340278945 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.409966619299999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:112.0 - frontier/cluster_1/score:2.471257306798999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:2.1204142999999998 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:3.186246938569999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:1.9978456999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:1.7362125883699997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.701252438490056 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:3.0609004989592994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.5777625899999994 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:3.211645699999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:3.142675871279899 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:3.022926982120727 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.863193989999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - 
frontier/cluster_16/score:2.9861587127989995 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.9217099899999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.1481519899999992 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.3023509999999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:128.0 - frontier/cluster_20/score:3.2554824903539292 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.6153242842115096 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.366098934759299 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:2.4316489999999997 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.9540999999999997 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.3682149929999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.9475403108999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:144.0 - frontier/cluster_30/score:1.8672047624386994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.409058972999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:128.0 - frontier/cluster_32/score:1.6428968374009998 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:3.4034803108999996 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:112.0 - frontier/cluster_34/score:3.1225071385012995 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.7385380337389993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:96.0 - frontier/cluster_36/score:2.4861525883699995 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.6456463929999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:2.4835968967699995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.9802267325699994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:4.188877099999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:3.0567481127989993 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:112.0 - frontier/cluster_42/score:1.8490915048999996 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.7071515048999992 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.8034803108999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:128.0 - frontier/cluster_46/score:3.1019179631480562 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.2657524750999998 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:112.0 - frontier/cluster_48/score:1.9957779265699993 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:3.4948081127989994 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:1.2401 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.5530562999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:2.3423519899999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:2.796859302701299 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:3.5560145683699993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.5777625899999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:1.9853847325699994 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:80.0 - frontier/cluster_63/score:2.0480374099999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:262.0 - cluster/prob_snapshot/cluster_0:0.02357213486599351 - cluster/prob_snapshot/cluster_1:0.017083102865211065 - 
cluster/prob_snapshot/cluster_2:0.014657824381178753 - cluster/prob_snapshot/cluster_3:0.02202562398330718 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.013810542313024927 - cluster/prob_snapshot/cluster_6:0.012001946604830602 - cluster/prob_snapshot/cluster_7:0.018672994165629852 - cluster/prob_snapshot/cluster_8:0.021159139495525867 - cluster/prob_snapshot/cluster_9:0.017819343767202706 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.022201198438044818 - cluster/prob_snapshot/cluster_12:0.02172442951747151 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.020896639312917437 - cluster/prob_snapshot/cluster_15:0.019792450312500945 - cluster/prob_snapshot/cluster_16:0.0206424706655367 - cluster/prob_snapshot/cluster_17:0.01328423768174775 - cluster/prob_snapshot/cluster_18:0.021762284377481518 - cluster/prob_snapshot/cluster_19:0.022828218524563824 - cluster/prob_snapshot/cluster_20:0.02250422977227156 - cluster/prob_snapshot/cluster_21:0.018078997136457883 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.016356172779157324 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.016809301841941425 - cluster/prob_snapshot/cluster_26:0.013508140660653629 - cluster/prob_snapshot/cluster_27:0.016370800491332504 - cluster/prob_snapshot/cluster_28:0.02037551257488559 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.012907458458248963 - cluster/prob_snapshot/cluster_31:0.016653143373979717 - cluster/prob_snapshot/cluster_32:0.011356881209024974 - cluster/prob_snapshot/cluster_33:0.02352729685041827 - cluster/prob_snapshot/cluster_34:0.021585008771695727 - cluster/prob_snapshot/cluster_35:0.025843456130673893 - cluster/prob_snapshot/cluster_36:0.01718606973417433 - cluster/prob_snapshot/cluster_37:0.011375888109278585 - cluster/prob_snapshot/cluster_38:0.017168402960919097 - 
cluster/prob_snapshot/cluster_39:0.02060146456383794 - cluster/prob_snapshot/cluster_40:0.02895652273526987 - cluster/prob_snapshot/cluster_41:0.021130435224336736 - cluster/prob_snapshot/cluster_42:0.012782246631497312 - cluster/prob_snapshot/cluster_43:0.022201198438044818 - cluster/prob_snapshot/cluster_44:0.018713772743405837 - cluster/prob_snapshot/cluster_45:0.019379666536518174 - cluster/prob_snapshot/cluster_46:0.021442681625306867 - cluster/prob_snapshot/cluster_47:0.02257522327236505 - cluster/prob_snapshot/cluster_48:0.013796248380090683 - cluster/prob_snapshot/cluster_49:0.024158620116515196 - cluster/prob_snapshot/cluster_50:0.008572460587112515 - cluster/prob_snapshot/cluster_51:0.017648556171622694 - cluster/prob_snapshot/cluster_52:0.01619201686591369 - cluster/prob_snapshot/cluster_53:0.013203289832581974 - cluster/prob_snapshot/cluster_54:0.01933389737932899 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.024581723034069625 - cluster/prob_snapshot/cluster_57:0.017819343767202706 - cluster/prob_snapshot/cluster_58:0.013724403169269608 - cluster/prob_snapshot/cluster_59:0.02348738338889825 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.014157503409529065
(RewardLoopWorker pid=2826757) WARNING:2026-04-12 20:08:26,716:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  33%|███▎      | 263/800 [8:38:17<21:03:58, 141.23s/it]
[36m(TaskRunner pid=2823680)[0m step:263 - global_seqlen/min:325522 - global_seqlen/max:518632 - global_seqlen/minmax_diff:193110 - global_seqlen/balanced_min:416297 - global_seqlen/balanced_max:416422 - global_seqlen/mean:416379.5 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.22032775522934067) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012636781670153141 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.013447739740513498) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0009426449058486873) - actor/ppo_kl:np.float64(-0.0008881416730901037) - actor/pg_clipfrac_lower:np.float64(8.270234033665878e-05) - actor/grad_norm:np.float64(0.3033891941110293) - perf/mfu/actor:np.float64(0.2355465088779325) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(113.29956817626953) - actor/lr:np.float64(1e-06) - training/global_step:263 - training/epoch:0 - critic/score/mean:0.6111111044883728 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6116229891777039 - critic/rewards/max:1.1805250644683838 - critic/rewards/min:-0.0639980360865593 - critic/advantages/mean:-0.10387232899665833 - critic/advantages/max:2.474771738052368 - critic/advantages/min:-2.474839925765991 - critic/returns/mean:-0.10387232899665833 - critic/returns/max:2.474771738052368 - critic/returns/min:-2.474839925765991 - response_length/mean:1282.629150390625 - response_length/max:8192.0 - response_length/min:124.0 - response_length/clip_ratio:0.02777777798473835 - response_length_non_aborted/mean:1282.629150390625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:124.0 - response_length_non_aborted/clip_ratio:0.02777777798473835 - response/aborted_ratio:0.0 - prompt_length/mean:235.07777404785156 - prompt_length/max:728.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.366629481315613e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5379042699933052) - timing_s/agent_loop/generate_sequences/max:np.float64(31.855252305977046) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.718410964833311) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(31.855252305977046) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:190 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:34.08453112188727 - timing_s/reward:0.00013803225010633469 - timing_s/old_log_prob:12.630633992142975 - timing_s/ref:28.071910408325493 - timing_s/adv:0.06777777336537838 - timing_s/update_actor:20.952349172905087 - timing_s/update_weights:39.63452435377985 - timing_s/step:135.8568534553051 - timing_s/stop_profile:6.537977606058121e-05 - timing_per_token_ms/adv:6.20250152279969e-05 - timing_per_token_ms/update_actor:0.019173981557434588 - timing_per_token_ms/gen:0.036908272311633405 - timing_per_token_ms/ref:0.025689257467474684 - perf/total_num_tokens:1665518 - perf/time_per_step:135.8568534553051 - perf/throughput:3064.839862031566 - frontier/active_count:53.0 - frontier/completed_count:11.0 - frontier/blacklisted_count:1225.0 - frontier/mean_score:2.668389951424038 - frontier/mean_frontier_pct:0.5172274015102215 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.2869766335099992 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:112.0 - frontier/cluster_1/score:2.471257306798999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:2.1204142999999998 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:3.186246938569999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:2.2984919899999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:1.7362125883699997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.701252438490056 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:3.0609004989592994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.7044338129999996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:3.211645699999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:3.099873109895929 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:3.022926982120727 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.863193989999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - 
frontier/cluster_16/score:2.9861587127989995 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.9217099899999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.1481519899999992 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.3023509999999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.6153242842115096 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.366098934759299 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:80.0 - frontier/cluster_25/score:2.6021542999999996 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.9540999999999997 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.3682149929999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.9632782176299997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.6070433337070895 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.409058972999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:128.0 - frontier/cluster_32/score:1.6428968374009998 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:3.2824362176299995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:112.0 - frontier/cluster_34/score:3.0857549969509095 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.7385380337389993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:96.0 - frontier/cluster_36/score:2.4861525883699995 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.6456463929999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:2.4835968967699995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.9802267325699994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:4.188877099999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:3.0567481127989993 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:128.0 - frontier/cluster_42/score:2.1943640534299993 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.7071515048999992 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.8034803108999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:2.4713425742036392 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.2657524750999998 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:112.0 - frontier/cluster_48/score:1.9957779265699993 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:3.3463656789592995 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:1.2401 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.5530562999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:2.3423519899999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.91 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:2.796859302701299 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:3.3892101978589992 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.5777625899999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.2897693127989993 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:80.0 - frontier/cluster_63/score:2.3336261869999992 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:263.0 - cluster/prob_snapshot/cluster_0:0.023241890494401368 - cluster/prob_snapshot/cluster_1:0.01747401886662571 - 
cluster/prob_snapshot/cluster_2:0.014993242258231832 - cluster/prob_snapshot/cluster_3:0.022529640667170336 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01625241219825549 - cluster/prob_snapshot/cluster_6:0.01227658950858007 - cluster/prob_snapshot/cluster_7:0.019100291962245726 - cluster/prob_snapshot/cluster_8:0.021643328244503667 - cluster/prob_snapshot/cluster_9:0.019122787150446328 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.022709232826673798 - cluster/prob_snapshot/cluster_12:0.02191889976711069 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.021374821218608887 - cluster/prob_snapshot/cluster_15:0.02024536484421153 - cluster/prob_snapshot/cluster_16:0.021114836380100407 - cluster/prob_snapshot/cluster_17:0.01358822350430964 - cluster/prob_snapshot/cluster_18:0.022260275009371815 - cluster/prob_snapshot/cluster_19:0.023350601137105208 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.01849270238227498 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.016730454296497832 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.018399578711197934 - cluster/prob_snapshot/cluster_26:0.013817250099101303 - cluster/prob_snapshot/cluster_27:0.016745416737486536 - cluster/prob_snapshot/cluster_28:0.020953050635183896 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01136324633433518 - cluster/prob_snapshot/cluster_31:0.017034220527826174 - cluster/prob_snapshot/cluster_32:0.011616762954501912 - cluster/prob_snapshot/cluster_33:0.023209785657510786 - cluster/prob_snapshot/cluster_34:0.02181907197043258 - cluster/prob_snapshot/cluster_35:0.02643483701815372 - cluster/prob_snapshot/cluster_36:0.017579341946694818 - cluster/prob_snapshot/cluster_37:0.011636204793390796 - cluster/prob_snapshot/cluster_38:0.017561270901193884 - cluster/prob_snapshot/cluster_39:0.021072891927714645 - 
cluster/prob_snapshot/cluster_40:0.02961914053789375 - cluster/prob_snapshot/cluster_41:0.021613967127833634 - cluster/prob_snapshot/cluster_42:0.015516133736615322 - cluster/prob_snapshot/cluster_43:0.022709232826673798 - cluster/prob_snapshot/cluster_44:0.019142003684234053 - cluster/prob_snapshot/cluster_45:0.019823135255929367 - cluster/prob_snapshot/cluster_46:0.017474621784109568 - cluster/prob_snapshot/cluster_47:0.023091816544811322 - cluster/prob_snapshot/cluster_48:0.014111950644124417 - cluster/prob_snapshot/cluster_49:0.023661824629870317 - cluster/prob_snapshot/cluster_50:0.008768625888079182 - cluster/prob_snapshot/cluster_51:0.01805241155221647 - cluster/prob_snapshot/cluster_52:0.016562541971218276 - cluster/prob_snapshot/cluster_53:0.013505423309596998 - cluster/prob_snapshot/cluster_54:0.019776318754117974 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.02396477403522973 - cluster/prob_snapshot/cluster_57:0.0182271073139231 - cluster/prob_snapshot/cluster_58:0.016190735000353666 - cluster/prob_snapshot/cluster_59:0.024024849800622475 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.016500842671097252
(TaskRunner pid=2823680) Training Progress:  33%|███▎      | 264/800 [8:40:38<21:00:37, 141.11s/it]
[36m(TaskRunner pid=2823680)[0m step:264 - global_seqlen/min:332291 - global_seqlen/max:454152 - global_seqlen/minmax_diff:121861 - global_seqlen/balanced_min:403088 - global_seqlen/balanced_max:403258 - global_seqlen/mean:403185.0 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.1827387818686512) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011078165844082832 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.032703672819479834) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006762110541024654) - actor/ppo_kl:np.float64(-0.0012360007288892269) - actor/pg_clipfrac_lower:np.float64(0.00013616320493569294) - actor/grad_norm:np.float64(0.29800739884376526) - perf/mfu/actor:np.float64(0.18991876257132057) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(113.25961303710938) - actor/lr:np.float64(1e-06) - training/global_step:264 - training/epoch:0 - critic/score/mean:0.6662371158599854 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.671278715133667 - critic/rewards/max:1.2307325601577759 - critic/rewards/min:-0.054204076528549194 - critic/advantages/mean:-0.13984228670597076 - critic/advantages/max:2.4747331142425537 - critic/advantages/min:-2.4747979640960693 - critic/returns/mean:-0.13984228670597076 - critic/returns/max:2.4747331142425537 - critic/returns/min:-2.4747979640960693 - response_length/mean:1235.0076904296875 - response_length/max:8192.0 - response_length/min:208.0 - response_length/clip_ratio:0.0335051529109478 - response_length_non_aborted/mean:1235.0076904296875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:208.0 - response_length_non_aborted/clip_ratio:0.0335051529109478 - response/aborted_ratio:0.0 - prompt_length/mean:231.2577362060547 - prompt_length/max:497.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.657295256853104e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5795166660100222) - timing_s/agent_loop/generate_sequences/max:np.float64(31.51045950781554) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.921955923193309) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(31.51045950781554) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:207 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:34.65908073540777 - timing_s/reward:0.00013905391097068787 - timing_s/old_log_prob:15.33906396664679 - timing_s/ref:28.280411827377975 - timing_s/adv:0.08324742317199707 - timing_s/update_actor:25.301354226656258 - timing_s/update_weights:36.565347546711564 - timing_s/step:140.62319560721517 - timing_s/stop_profile:7.266364991664886e-05 - timing_per_token_ms/adv:7.3163836849698e-05 - timing_per_token_ms/update_actor:0.022236654087068326 - timing_per_token_ms/gen:0.03616476454236458 - timing_per_token_ms/ref:0.024854864668971046 - perf/total_num_tokens:1612740 - perf/time_per_step:140.62319560721517 - perf/throughput:2867.1301221611066 - frontier/active_count:51.0 - frontier/completed_count:13.0 - frontier/blacklisted_count:1256.0 - frontier/mean_score:2.6498262259474883 - frontier/mean_frontier_pct:0.5135299985417213 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.2008836434569994 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:112.0 - frontier/cluster_1/score:2.471257306798999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:2.1204142999999998 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:3.186246938569999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:2.2984919899999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:1.7362125883699997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.790876706943039 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:2.4426303492715093 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.7044338129999996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:3.211645699999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:3.099873109895929 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:3.0160488874845086 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.863193989999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - 
frontier/cluster_16/score:2.9861587127989995 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.9217099899999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:3.103706392999999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.211645699999999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.6153242842115096 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.366098934759299 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:80.0 - frontier/cluster_25/score:2.6021542999999996 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.9540999999999997 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.3682149929999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.9632782176299997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.6070433337070895 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.409058972999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:128.0 - frontier/cluster_32/score:1.6428968374009998 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:3.2824362176299995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.5169766236172992 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:96.0 - frontier/cluster_36/score:2.4861525883699995 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.6456463929999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:2.4835968967699995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.9802267325699994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:3.8322139699999993 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:3.0397236789592994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:144.0 - frontier/cluster_42/score:1.8360548374009995 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:2.7071515048999992 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.862436217629999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:2.4713425742036392 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:3.78602673257 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:112.0 - frontier/cluster_48/score:1.9957779265699993 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:3.2424559752715094 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:1.2401 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.5530562999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:2.3423519899999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:2.237 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:2.796859302701299 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:3.3892101978589992 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.5777625899999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:80.0 - frontier/cluster_63/score:2.3336261869999992 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:264.0 - cluster/prob_snapshot/cluster_0:0.023685486907379396 - cluster/prob_snapshot/cluster_1:0.018286491826905952 - cluster/prob_snapshot/cluster_2:0.015690368890331936 - 
cluster/prob_snapshot/cluster_3:0.023577180101951818 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.017008085266437385 - cluster/prob_snapshot/cluster_6:0.012847402502220124 - cluster/prob_snapshot/cluster_7:0.02065157033668897 - cluster/prob_snapshot/cluster_8:0.018074661750201518 - cluster/prob_snapshot/cluster_9:0.02001192133323048 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.02376512258856598 - cluster/prob_snapshot/cluster_12:0.022938042158783593 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.022317770463964135 - cluster/prob_snapshot/cluster_15:0.021186694462342268 - cluster/prob_snapshot/cluster_16:0.022096592995479742 - cluster/prob_snapshot/cluster_17:0.014220022305705113 - cluster/prob_snapshot/cluster_18:0.022966407193844872 - cluster/prob_snapshot/cluster_19:0.02376512258856598 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.019352540108375005 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.017508354436863983 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.01925508655386991 - cluster/prob_snapshot/cluster_26:0.014459697733880419 - cluster/prob_snapshot/cluster_27:0.017524012572347234 - cluster/prob_snapshot/cluster_28:0.02192728485150285 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.011891592472571537 - cluster/prob_snapshot/cluster_31:0.017826244599903986 - cluster/prob_snapshot/cluster_32:0.012156896615713907 - cluster/prob_snapshot/cluster_33:0.024288949151871207 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.02602447106833298 - cluster/prob_snapshot/cluster_36:0.018396712014807137 - cluster/prob_snapshot/cluster_37:0.012177242423433082 - cluster/prob_snapshot/cluster_38:0.018377800736962085 - cluster/prob_snapshot/cluster_39:0.022052698291485734 - cluster/prob_snapshot/cluster_40:0.028357123820558765 - 
cluster/prob_snapshot/cluster_41:0.022492956139537605 - cluster/prob_snapshot/cluster_42:0.0135862023292807 - cluster/prob_snapshot/cluster_43:0.02376512258856598 - cluster/prob_snapshot/cluster_44:0.02003203136004989 - cluster/prob_snapshot/cluster_45:0.021181087195865995 - cluster/prob_snapshot/cluster_46:0.018287122777674924 - cluster/prob_snapshot/cluster_47:0.028015353444221444 - cluster/prob_snapshot/cluster_48:0.0147681006919568 - cluster/prob_snapshot/cluster_49:0.023993108498971637 - cluster/prob_snapshot/cluster_50:0.009176332408671568 - cluster/prob_snapshot/cluster_51:0.0188917774912129 - cluster/prob_snapshot/cluster_52:0.017332634850700215 - cluster/prob_snapshot/cluster_53:0.016553064751389644 - cluster/prob_snapshot/cluster_54:0.020695839578963385 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.025079041511502063 - cluster/prob_snapshot/cluster_57:0.019074595916843927 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.025141910563048517 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.01726806677646379
(TaskRunner pid=2823680) Training Progress:  33%|███▎      | 265/800 [8:42:55<20:47:15, 139.88s/it]
[36m(TaskRunner pid=2823680)[0m step:265 - global_seqlen/min:445335 - global_seqlen/max:473818 - global_seqlen/minmax_diff:28483 - global_seqlen/balanced_min:458285 - global_seqlen/balanced_max:458429 - global_seqlen/mean:458372.75 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.18607661219151772) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01085173524916172 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.045979820657521486) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006896798715941892) - actor/ppo_kl:np.float64(0.0007909505724426155) - actor/pg_clipfrac_lower:np.float64(1.0773675235807035e-05) - actor/grad_norm:np.float64(0.2881462400158246) - perf/mfu/actor:np.float64(0.25218970894412374) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(113.30689525604248) - actor/lr:np.float64(1e-06) - training/global_step:265 - training/epoch:0 - critic/score/mean:0.5782967209815979 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5807386040687561 - critic/rewards/max:1.364332675933838 - critic/rewards/min:-0.0739600732922554 - critic/advantages/mean:-0.09076876938343048 - critic/advantages/max:2.474593162536621 - critic/advantages/min:-2.474843978881836 - critic/returns/mean:-0.09076876938343048 - critic/returns/max:2.474593162536621 - critic/returns/min:-2.474843978881836 - response_length/mean:1326.995849609375 - response_length/max:8192.0 - response_length/min:237.0 - response_length/clip_ratio:0.030219780281186104 - response_length_non_aborted/mean:1326.995849609375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:237.0 - response_length_non_aborted/clip_ratio:0.030219780281186104 - response/aborted_ratio:0.0 - prompt_length/mean:234.54945373535156 - prompt_length/max:353.0 - prompt_length/min:184.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.334778249263763e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7800981029868126) - timing_s/agent_loop/generate_sequences/max:np.float64(32.99844110291451) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.218502757998067) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(32.99844110291451) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:185 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:34.791913849301636 - timing_s/reward:0.00013365689665079117 - timing_s/old_log_prob:12.703201936557889 - timing_s/ref:28.263631163164973 - timing_s/adv:0.07448825240135193 - timing_s/update_actor:21.720349025912583 - timing_s/update_weights:38.844877927564085 - timing_s/step:136.7706905864179 - timing_s/stop_profile:5.9497542679309845e-05 - timing_per_token_ms/adv:6.552421250905119e-05 - timing_per_token_ms/update_actor:0.019106486183569374 - timing_per_token_ms/gen:0.03601449801336121 - timing_per_token_ms/ref:0.02486233889115985 - perf/total_num_tokens:1833491 - perf/time_per_step:136.7706905864179 - perf/throughput:3351.3960340090507 - frontier/active_count:50.0 - frontier/completed_count:14.0 - frontier/blacklisted_count:1293.0 - frontier/mean_score:2.7045280104081035 - frontier/mean_frontier_pct:0.5393548319654592 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.2008836434569994 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:112.0 - frontier/cluster_1/score:2.471257306798999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:80.0 - frontier/cluster_2/score:2.38429001 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:3.186246938569999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:2.2984919899999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:1.7362125883699997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:2.790876706943039 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:2.609841244490056 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.7044338129999996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.548151989999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:128.0 - frontier/cluster_12/score:3.06991117692715 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:3.0160488874845086 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.863193989999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - 
frontier/cluster_16/score:2.9861587127989995 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.9217099899999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:3.103706392999999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.7481519899999993 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.6153242842115096 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.366098934759299 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:80.0 - frontier/cluster_25/score:2.6021542999999996 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:2.2678699999999994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.3682149929999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.9632782176299997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.6070433337070895 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.409058972999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:128.0 - frontier/cluster_32/score:2.0500277861806997 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:112.0 - frontier/cluster_33/score:3.1977053523409995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:3.9618836365321095 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:96.0 - frontier/cluster_36/score:2.4861525883699995 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.6456463929999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.6385178277389993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.9802267325699994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:3.8322139699999993 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:3.0397236789592994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:144.0 - frontier/cluster_42/score:1.8360548374009995 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.795006053429999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.862436217629999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:2.4713425742036392 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:3.78602673257 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:128.0 - frontier/cluster_48/score:1.6970445485989996 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:3.2424559752715094 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.5530562999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.9396463929999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:32.0 - frontier/cluster_53/score:2.4659 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:2.796859302701299 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:3.3892101978589992 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.5777625899999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:96.0 - frontier/cluster_63/score:2.533538330899999 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:265.0 - cluster/prob_snapshot/cluster_0:0.023670552725937547 - cluster/prob_snapshot/cluster_1:0.01827496182171983 - cluster/prob_snapshot/cluster_2:0.017631838167874763 - 
cluster/prob_snapshot/cluster_3:0.023562314210154593 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.016997361322600355 - cluster/prob_snapshot/cluster_6:0.012839301953526535 - cluster/prob_snapshot/cluster_7:0.020638549101378363 - cluster/prob_snapshot/cluster_8:0.01929979082816924 - cluster/prob_snapshot/cluster_9:0.019999303409631984 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.018843598440790355 - cluster/prob_snapshot/cluster_12:0.02270201059196212 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.02230369865556983 - cluster/prob_snapshot/cluster_15:0.02117333582038186 - cluster/prob_snapshot/cluster_16:0.02208266064397979 - cluster/prob_snapshot/cluster_17:0.014211056292295679 - cluster/prob_snapshot/cluster_18:0.02295192640679407 - cluster/prob_snapshot/cluster_19:0.027717605257372924 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.019340337938055716 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.017497315063135642 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.019242945829999695 - cluster/prob_snapshot/cluster_26:0.016770911532602586 - cluster/prob_snapshot/cluster_27:0.01751296332584586 - cluster/prob_snapshot/cluster_28:0.021913459252232714 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.011884094581550238 - cluster/prob_snapshot/cluster_31:0.017815004789959493 - cluster/prob_snapshot/cluster_32:0.015159967123959332 - cluster/prob_snapshot/cluster_33:0.02364704924508049 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.029298151997577394 - cluster/prob_snapshot/cluster_36:0.01838511251354981 - cluster/prob_snapshot/cluster_37:0.012169564424305428 - cluster/prob_snapshot/cluster_38:0.019511854324192092 - cluster/prob_snapshot/cluster_39:0.022038793616489805 - cluster/prob_snapshot/cluster_40:0.028339244076985782 - 
cluster/prob_snapshot/cluster_41:0.02247877387301022 - cluster/prob_snapshot/cluster_42:0.013577635952263222 - cluster/prob_snapshot/cluster_43:0.023750138195206737 - cluster/prob_snapshot/cluster_44:0.020669085642106125 - cluster/prob_snapshot/cluster_45:0.021167732089401196 - cluster/prob_snapshot/cluster_46:0.018275592374661503 - cluster/prob_snapshot/cluster_47:0.027997689193825004 - cluster/prob_snapshot/cluster_48:0.012549654076926507 - cluster/prob_snapshot/cluster_49:0.023977980355856875 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.01887986584109922 - cluster/prob_snapshot/cluster_52:0.014343696094368155 - cluster/prob_snapshot/cluster_53:0.01823534450750913 - cluster/prob_snapshot/cluster_54:0.020682790430994746 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.025063228665526592 - cluster/prob_snapshot/cluster_57:0.01906256899599294 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0251260580768568 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.018735530348733178
[36m(TaskRunner pid=2823680)[0m Training Progress:  33%|███▎      | 266/800 [8:45:09<20:28:57, 138.09s/it]
[36m(TaskRunner pid=2823680)[0m step:266 - global_seqlen/min:423162 - global_seqlen/max:487018 - global_seqlen/minmax_diff:63856 - global_seqlen/balanced_min:453150 - global_seqlen/balanced_max:453200 - global_seqlen/mean:453180.5 - frontier/skipped_zero_acc_count:26.0 - actor/entropy:np.float64(0.2058108159724404) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011746585369110107 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.010500232136109844) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000748916712872611) - actor/ppo_kl:np.float64(0.0014355604868404614) - actor/pg_clipfrac_lower:np.float64(4.562704852301692e-06) - actor/grad_norm:np.float64(0.3356384921532411) - perf/mfu/actor:np.float64(0.19819746966308235) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(114.03109359741211) - actor/lr:np.float64(1e-06) - training/global_step:266 - training/epoch:0 - critic/score/mean:0.6078431606292725 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6142640709877014 - critic/rewards/max:1.2237704992294312 - critic/rewards/min:-0.10629937052726746 - critic/advantages/mean:-0.08197477459907532 - critic/advantages/max:2.4746434688568115 - critic/advantages/min:-2.4747374057769775 - critic/returns/mean:-0.08197477459907532 - critic/returns/max:2.4746434688568115 - critic/returns/min:-2.4747374057769775 - response_length/mean:1381.4093017578125 - response_length/max:8192.0 - response_length/min:192.0 - response_length/clip_ratio:0.02818627469241619 - response_length_non_aborted/mean:1381.4093017578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:192.0 - response_length_non_aborted/clip_ratio:0.02818627469241619 - response/aborted_ratio:0.0 - prompt_length/mean:234.56863403320312 - prompt_length/max:409.0 - prompt_length/min:182.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.593033999204636e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.8955008704215288) - timing_s/agent_loop/generate_sequences/max:np.float64(32.740281916223466) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.460574980796082) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(32.740281916223466) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:207 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:35.42286680545658 - timing_s/reward:0.00015743356198072433 - timing_s/old_log_prob:14.21748848259449 - timing_s/ref:21.668680145405233 - timing_s/adv:0.08973647654056549 - timing_s/update_actor:27.206073329783976 - timing_s/update_weights:34.594389534555376 - timing_s/step:133.67153715435416 - timing_s/stop_profile:5.5390410125255585e-05 - timing_per_token_ms/adv:6.805239689783359e-05 - timing_per_token_ms/update_actor:0.02063195003464482 - timing_per_token_ms/gen:0.031424701973383054 - timing_per_token_ms/ref:0.016432622255240052 - perf/total_num_tokens:1812722 - perf/time_per_step:133.67153715435416 - perf/throughput:3390.2542728800986 - frontier/active_count:49.0 - frontier/completed_count:15.0 - frontier/blacklisted_count:1319.0 - frontier/mean_score:2.7063879510345563 - frontier/mean_frontier_pct:0.5431517596646953 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:15.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.2008836434569994 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:112.0 - frontier/cluster_1/score:2.471257306798999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:80.0 - frontier/cluster_2/score:2.38429001 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:112.0 - frontier/cluster_3/score:3.1303728569989993 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:2.2984919899999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:2.1153488118589996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:2.609841244490056 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.7044338129999996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.548151989999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:128.0 - frontier/cluster_12/score:3.048937823849005 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:3.011234221239156 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:2.9042357929999993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - 
frontier/cluster_16/score:2.9861587127989995 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.9217099899999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:3.103706392999999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.7481519899999993 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.6153242842115096 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.366098934759299 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:80.0 - frontier/cluster_25/score:2.6021542999999996 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:2.2678699999999994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.3682149929999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.9742947523409997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.6070433337070895 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.409058972999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:144.0 - frontier/cluster_32/score:2.33501945032649 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:112.0 - frontier/cluster_33/score:3.1383937466386995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:3.9618836365321095 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:96.0 - frontier/cluster_36/score:2.4861525883699995 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.6456463929999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.7469624794172995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:2.9861587127989995 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:3.8322139699999993 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:3.0278065752715095 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:144.0 - frontier/cluster_42/score:1.8360548374009995 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.211645699999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.856504237400999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.862436217629999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:2.4713425742036392 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:3.5502187127989995 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.4879311840192997 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:3.2424559752715094 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.5530562999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.9396463929999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:32.0 - frontier/cluster_53/score:2.4659 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:2.796859302701299 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:3.2724471385012994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.5777625899999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:96.0 - frontier/cluster_63/score:2.533538330899999 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:266.0 - cluster/prob_snapshot/cluster_0:0.024137025870199887 - cluster/prob_snapshot/cluster_1:0.0186351046118335 - cluster/prob_snapshot/cluster_2:0.017979306986390386 - 
cluster/prob_snapshot/cluster_3:0.023605322482498237 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.017332326571283727 - cluster/prob_snapshot/cluster_6:0.015951291794285175 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.019680129817945195 - cluster/prob_snapshot/cluster_9:0.020393427621793917 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.019214947294180693 - cluster/prob_snapshot/cluster_12:0.02299124220941492 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.02270692920933011 - cluster/prob_snapshot/cluster_15:0.0219000820639306 - cluster/prob_snapshot/cluster_16:0.022517841362552023 - cluster/prob_snapshot/cluster_17:0.014491112114764594 - cluster/prob_snapshot/cluster_18:0.02340423765621087 - cluster/prob_snapshot/cluster_19:0.028263833248984686 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.019721475985549443 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01784213284868342 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.01962216458125234 - cluster/prob_snapshot/cluster_26:0.0171014141586011 - cluster/prob_snapshot/cluster_27:0.01785808949009494 - cluster/prob_snapshot/cluster_28:0.022428378341588073 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01211829320928622 - cluster/prob_snapshot/cluster_31:0.018166083254228516 - cluster/prob_snapshot/cluster_32:0.017607770590211257 - cluster/prob_snapshot/cluster_33:0.023665805912169668 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.02987552766098597 - cluster/prob_snapshot/cluster_36:0.01874742603200073 - cluster/prob_snapshot/cluster_37:0.01240938877682629 - cluster/prob_snapshot/cluster_38:0.020714125165310623 - cluster/prob_snapshot/cluster_39:0.022517841362552023 - cluster/prob_snapshot/cluster_40:0.02889772264078053 - 
cluster/prob_snapshot/cluster_41:0.022831896994031278 - cluster/prob_snapshot/cluster_42:0.013845209025339858 - cluster/prob_snapshot/cluster_43:0.024218179722113854 - cluster/prob_snapshot/cluster_44:0.02154015089471332 - cluster/prob_snapshot/cluster_45:0.021584882405195297 - cluster/prob_snapshot/cluster_46:0.018635747591017042 - cluster/prob_snapshot/cluster_47:0.026771270205607642 - cluster/prob_snapshot/cluster_48:0.011220099660655927 - cluster/prob_snapshot/cluster_49:0.024450511944753863 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.019251929412411533 - cluster/prob_snapshot/cluster_52:0.014626365835753266 - cluster/prob_snapshot/cluster_53:0.018594706563292634 - cluster/prob_snapshot/cluster_54:0.021090384051480556 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.024676667457852432 - cluster/prob_snapshot/cluster_57:0.01943823307955846 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.025621214715137848 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.019104749515365456
[36m(RewardLoopWorker pid=2826762)[0m WARNING:2026-04-12 20:17:34,677:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  33%|███▎      | 267/800 [8:47:25<20:21:53, 137.55s/it]
[36m(TaskRunner pid=2823680)[0m step:267 - global_seqlen/min:293891 - global_seqlen/max:520407 - global_seqlen/minmax_diff:226516 - global_seqlen/balanced_min:421696 - global_seqlen/balanced_max:421952 - global_seqlen/mean:421881.25 - frontier/skipped_zero_acc_count:29.0 - actor/entropy:np.float64(0.20608682921156288) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.014714694581925869 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.09866900159977376) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0010058886701881419) - actor/ppo_kl:np.float64(-5.448674813139709e-05) - actor/pg_clipfrac_lower:np.float64(6.452205199821037e-05) - actor/grad_norm:np.float64(0.39828942716121674) - perf/mfu/actor:np.float64(0.22001397286193775) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(114.11645126342773) - actor/lr:np.float64(1e-06) - training/global_step:267 - training/epoch:0 - critic/score/mean:0.631313145160675 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6338590383529663 - critic/rewards/max:1.1558139324188232 - critic/rewards/min:-0.07530046999454498 - critic/advantages/mean:-0.10297419130802155 - critic/advantages/max:2.473747730255127 - critic/advantages/min:-2.474830150604248 - critic/returns/mean:-0.10297419130802155 - critic/returns/max:2.473747730255127 - critic/returns/min:-2.474830150604248 - response_length/mean:1254.11865234375 - response_length/max:8192.0 - response_length/min:136.0 - response_length/clip_ratio:0.034090910106897354 - response_length_non_aborted/mean:1254.11865234375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:136.0 - response_length_non_aborted/clip_ratio:0.034090910106897354 - response/aborted_ratio:0.0 - prompt_length/mean:232.71717834472656 - prompt_length/max:434.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.091198444366455e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1610537124797702) - timing_s/agent_loop/generate_sequences/max:np.float64(33.387547641061246) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.29039406625634) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(33.387547641061246) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:190 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:34.96922935079783 - timing_s/reward:0.0002337861806154251 - timing_s/old_log_prob:13.582315833307803 - timing_s/ref:26.44042705465108 - timing_s/adv:0.0736874183639884 - timing_s/update_actor:22.896749664098024 - timing_s/update_weights:37.66992802824825 - timing_s/step:136.03518669959158 - timing_s/stop_profile:9.715743362903595e-05 - timing_per_token_ms/adv:6.257561593920077e-05 - timing_per_token_ms/update_actor:0.019444000686239696 - timing_per_token_ms/gen:0.035206450413685246 - timing_per_token_ms/ref:0.02245330404259187 - perf/total_num_tokens:1687525 - perf/time_per_step:136.03518669959158 - perf/throughput:3101.2656374827957 - frontier/active_count:48.0 - frontier/completed_count:16.0 - frontier/blacklisted_count:1348.0 - frontier/mean_score:2.729210332369419 - frontier/mean_frontier_pct:0.5534541378259076 - frontier/batch_easy_count:4.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.1406185504198993 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:112.0 - frontier/cluster_1/score:2.471257306798999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:80.0 - frontier/cluster_2/score:2.569003007 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:112.0 - frontier/cluster_3/score:3.1303728569989993 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:2.2984919899999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:2.1153488118589996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:2.609841244490056 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.7044338129999996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.548151989999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:3.034256476694303 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:3.011234221239156 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.3329650550999994 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - 
frontier/cluster_16/score:2.3903110989592995 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.9217099899999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:3.103706392999999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.7481519899999993 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.7307269989480565 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.366098934759299 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:96.0 - frontier/cluster_25/score:2.1215080099999994 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:2.2678699999999994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.3682149929999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:2.9742947523409997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.6070433337070895 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.5863412810999993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:144.0 - frontier/cluster_32/score:2.33501945032649 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:128.0 - frontier/cluster_33/score:3.0968756226470893 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:3.9618836365321095 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:1.6456463929999998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:3.4228737355921095 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:112.0 - frontier/cluster_39/score:3.5903110989592997 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:4.1825497789999995 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:3.0278065752715095 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:144.0 - frontier/cluster_42/score:1.8360548374009995 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.1481519899999992 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.856504237400999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.862436217629999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:2.4713425742036392 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:3.9851530989592994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.4879311840192997 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:3.2424559752715094 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.5530562999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.9396463929999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:32.0 - frontier/cluster_53/score:2.62613 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:2.796859302701299 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:3.2724471385012994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.5777625899999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.397706392999999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:96.0 - frontier/cluster_63/score:2.533538330899999 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:267.0 - cluster/prob_snapshot/cluster_0:0.023973803835391435 - cluster/prob_snapshot/cluster_1:0.0188642577724238 - cluster/prob_snapshot/cluster_2:0.01961039621768595 - 
cluster/prob_snapshot/cluster_3:0.02389559368656665 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0175454596605217 - cluster/prob_snapshot/cluster_6:0.0161474425005086 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.01992213350091598 - cluster/prob_snapshot/cluster_9:0.020644202623713467 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.019451230695575795 - cluster/prob_snapshot/cluster_12:0.023161892598771515 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.022986153002488425 - cluster/prob_snapshot/cluster_15:0.017808608618933595 - cluster/prob_snapshot/cluster_16:0.018246357675097965 - cluster/prob_snapshot/cluster_17:0.014669307204662724 - cluster/prob_snapshot/cluster_18:0.02369203614952985 - cluster/prob_snapshot/cluster_19:0.028611389479781976 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.020844910755569185 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.018061534951280232 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.016194458527971058 - cluster/prob_snapshot/cluster_26:0.01731170775161472 - cluster/prob_snapshot/cluster_27:0.01807768780918144 - cluster/prob_snapshot/cluster_28:0.02270417683539563 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.012267310091547483 - cluster/prob_snapshot/cluster_31:0.019742747337519354 - cluster/prob_snapshot/cluster_32:0.017824290774333736 - cluster/prob_snapshot/cluster_33:0.023639893698653932 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.030242902662651942 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.012561985218780138 - cluster/prob_snapshot/cluster_38:0.02612838909692724 - cluster/prob_snapshot/cluster_39:0.027406516459304146 - cluster/prob_snapshot/cluster_40:0.03192735008207205 - cluster/prob_snapshot/cluster_41:0.023112657497792125 - 
cluster/prob_snapshot/cluster_42:0.014015461539252514 - cluster/prob_snapshot/cluster_43:0.024031310087678885 - cluster/prob_snapshot/cluster_44:0.02180502698529245 - cluster/prob_snapshot/cluster_45:0.021850308552627793 - cluster/prob_snapshot/cluster_46:0.018864908658228485 - cluster/prob_snapshot/cluster_47:0.030420529304865423 - cluster/prob_snapshot/cluster_48:0.011358071587990575 - cluster/prob_snapshot/cluster_49:0.024751176320237614 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.01948866757751494 - cluster/prob_snapshot/cluster_52:0.014806224120910651 - cluster/prob_snapshot/cluster_53:0.02004647315663948 - cluster/prob_snapshot/cluster_54:0.02134972942485673 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.02498011283466088 - cluster/prob_snapshot/cluster_57:0.019677262193655475 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.025936275051660365 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.01933967783076391
(TaskRunner pid=2823680) Training Progress:  34%|███▎      | 268/800 [8:49:43<20:20:59, 137.71s/it]
[36m(TaskRunner pid=2823680)[0m step:268 - global_seqlen/min:408986 - global_seqlen/max:487682 - global_seqlen/minmax_diff:78696 - global_seqlen/balanced_min:446364 - global_seqlen/balanced_max:446380 - global_seqlen/mean:446369.25 - frontier/skipped_zero_acc_count:40.0 - actor/entropy:np.float64(0.1955289872871204) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011346235871315002 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.020998103991587413) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0007246991569693571) - actor/ppo_kl:np.float64(2.4788320233297284e-05) - actor/pg_clipfrac_lower:np.float64(2.8457283860916505e-05) - actor/grad_norm:np.float64(0.6801342720335181) - perf/mfu/actor:np.float64(0.24953855298637073) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(113.71555709838867) - actor/lr:np.float64(1e-06) - training/global_step:268 - training/epoch:0 - critic/score/mean:0.5568181872367859 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5620258450508118 - critic/rewards/max:1.15204656124115 - critic/rewards/min:-0.1440422534942627 - critic/advantages/mean:-0.055412791669368744 - critic/advantages/max:2.474106550216675 - critic/advantages/min:-2.4747636318206787 - critic/returns/mean:-0.055412791669368744 - critic/returns/max:2.474106550216675 - critic/returns/min:-2.4747636318206787 - response_length/mean:1366.132080078125 - response_length/max:8192.0 - response_length/min:272.0 - response_length/clip_ratio:0.02556818164885044 - response_length_non_aborted/mean:1366.132080078125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:272.0 - response_length_non_aborted/clip_ratio:0.02556818164885044 - response/aborted_ratio:0.0 - prompt_length/mean:240.34091186523438 - prompt_length/max:543.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.391216397285461e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(2.429448749870062) - timing_s/agent_loop/generate_sequences/max:np.float64(32.81586157437414) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.423605330691316) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(32.81586157437414) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:197 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:35.11126829683781 - timing_s/reward:0.00020524952560663223 - timing_s/old_log_prob:12.026187222450972 - timing_s/ref:30.691924115642905 - timing_s/adv:0.07269276399165392 - timing_s/update_actor:21.238351673819125 - timing_s/update_weights:38.31457647122443 - timing_s/step:137.85066687408835 - timing_s/stop_profile:5.520787090063095e-05 - timing_per_token_ms/adv:6.427544459396238e-05 - timing_per_token_ms/update_actor:0.018779097413800103 - timing_per_token_ms/gen:0.036507421621925086 - timing_per_token_ms/ref:0.027138011538584497 - perf/total_num_tokens:1785477 - perf/time_per_step:137.85066687408835 - perf/throughput:3238.063769453578 - frontier/active_count:48.0 - frontier/completed_count:16.0 - frontier/blacklisted_count:1388.0 - frontier/mean_score:2.7067983162035656 - frontier/mean_frontier_pct:0.5698837191649258 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.1406185504198993 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:2.0298801147592993 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:80.0 - frontier/cluster_2/score:2.569003007 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:112.0 - frontier/cluster_3/score:3.091260999899299 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:1.9089443929999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:2.1153488118589996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:2.609841244490056 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.7044338129999996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.548151989999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:3.034256476694303 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:3.011234221239156 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.5330755385699995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - 
frontier/cluster_16/score:2.3903110989592995 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.9217099899999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:3.103706392999999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.7481519899999993 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.7307269989480565 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.556269254331509 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:96.0 - frontier/cluster_25/score:2.1215080099999994 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:2.2678699999999994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.3682149929999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.9820063266387 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:176.0 - frontier/cluster_30/score:1.4249303335949626 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.5863412810999993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:160.0 - frontier/cluster_32/score:1.934513615228543 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:128.0 - frontier/cluster_33/score:3.0968756226470893 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:3.9618836365321095 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:1.4519524751 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:3.4228737355921095 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:112.0 - frontier/cluster_39/score:3.4132177692715095 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:4.1825497789999995 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:3.0194646026900562 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:144.0 - frontier/cluster_42/score:2.1852383861806994 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:3.1481519899999992 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.856504237400999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.862436217629999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:2.4713425742036392 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:3.9851530989592994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.4879311840192997 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:128.0 - frontier/cluster_49/score:3.7697191826900562 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.5530562999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.9396463929999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:32.0 - frontier/cluster_53/score:2.62613 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.8578015118909095 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:3.2724471385012994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.5777625899999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.278394475099999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:112.0 - frontier/cluster_63/score:2.073476831629999 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:268.0 - cluster/prob_snapshot/cluster_0:0.024172304505315517 - cluster/prob_snapshot/cluster_1:0.015623317335588674 - cluster/prob_snapshot/cluster_2:0.01977276831405477 - 
cluster/prob_snapshot/cluster_3:0.02379241572809966 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01469251499681211 - cluster/prob_snapshot/cluster_6:0.016281141690504465 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.02008708675044624 - cluster/prob_snapshot/cluster_9:0.020815134532516613 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.019612284917526993 - cluster/prob_snapshot/cluster_12:0.0233536707258111 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.023176476023453317 - cluster/prob_snapshot/cluster_15:0.019496246446450385 - cluster/prob_snapshot/cluster_16:0.018397435670356874 - cluster/prob_snapshot/cluster_17:0.014790767583976792 - cluster/prob_snapshot/cluster_18:0.023888203811526176 - cluster/prob_snapshot/cluster_19:0.02884828888957907 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.021017504507395077 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.019674760822200345 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.016328547005916908 - cluster/prob_snapshot/cluster_26:0.017455046940081445 - cluster/prob_snapshot/cluster_27:0.018227369235017728 - cluster/prob_snapshot/cluster_28:0.02295151856459954 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.010967218517483801 - cluster/prob_snapshot/cluster_31:0.019906215287768203 - cluster/prob_snapshot/cluster_32:0.014889312861866368 - cluster/prob_snapshot/cluster_33:0.02383562962643313 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.030493310836515568 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.011175199022712034 - cluster/prob_snapshot/cluster_38:0.026344729514801206 - cluster/prob_snapshot/cluster_39:0.02627041072872532 - cluster/prob_snapshot/cluster_40:0.0321917053101246 - cluster/prob_snapshot/cluster_41:0.02323982255326333 - 
cluster/prob_snapshot/cluster_42:0.01681905867887134 - cluster/prob_snapshot/cluster_43:0.02423028690355303 - cluster/prob_snapshot/cluster_44:0.02198557040234934 - cluster/prob_snapshot/cluster_45:0.022031226896480325 - cluster/prob_snapshot/cluster_46:0.01902110819303851 - cluster/prob_snapshot/cluster_47:0.030672408209353103 - cluster/prob_snapshot/cluster_48:0.011452115271451999 - cluster/prob_snapshot/cluster_49:0.02901428445403855 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.019650031773060475 - cluster/prob_snapshot/cluster_52:0.014928818158437064 - cluster/prob_snapshot/cluster_53:0.020212455925937595 - cluster/prob_snapshot/cluster_54:0.021995555095967387 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.025186945641273707 - cluster/prob_snapshot/cluster_57:0.019840187933539366 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.025232720328314306 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.015958866877410518
(TaskRunner pid=2823680) Training Progress:  34%|███▎      | 269/800 [8:52:18<21:04:49, 142.92s/it]
[36m(TaskRunner pid=2823680)[0m step:269 - global_seqlen/min:413060 - global_seqlen/max:477913 - global_seqlen/minmax_diff:64853 - global_seqlen/balanced_min:435097 - global_seqlen/balanced_max:435363 - global_seqlen/mean:435279.0 - frontier/skipped_zero_acc_count:27.0 - actor/entropy:np.float64(0.16661368066664128) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.00986479315906763 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.05699376993288752) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0017900509279998539) - actor/ppo_kl:np.float64(0.005367040722507779) - actor/pg_clipfrac_lower:np.float64(0.00011146692161798688) - actor/grad_norm:np.float64(0.32674196706368375) - perf/mfu/actor:np.float64(0.21056579238984624) - perf/max_memory_allocated_gb:np.float64(83.86622047424316) - perf/max_memory_reserved_gb:np.float64(89.8515625) - perf/cpu_memory_used_gb:np.float64(104.83821105957031) - actor/lr:np.float64(1e-06) - training/global_step:269 - training/epoch:0 - critic/score/mean:0.603960394859314 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6132025122642517 - critic/rewards/max:1.4101848602294922 - critic/rewards/min:-0.08902942389249802 - critic/advantages/mean:-0.07408465445041656 - critic/advantages/max:2.474144697189331 - critic/advantages/min:-2.4747862815856934 - critic/returns/mean:-0.07408465445041656 - critic/returns/max:2.474144697189331 - critic/returns/min:-2.4747862815856934 - response_length/mean:1383.8836669921875 - response_length/max:8192.0 - response_length/min:121.0 - response_length/clip_ratio:0.03465346619486809 - response_length_non_aborted/mean:1383.8836669921875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:121.0 - response_length_non_aborted/clip_ratio:0.03465346619486809 - response/aborted_ratio:0.0 - prompt_length/mean:236.40594482421875 - prompt_length/max:1168.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.819291204214096e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0141508569940925) - timing_s/agent_loop/generate_sequences/max:np.float64(32.18479889817536) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.701035331282583) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(32.18479889817536) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:205 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:34.50732257217169 - timing_s/reward:0.00014057476073503494 - timing_s/old_log_prob:14.531566049903631 - timing_s/ref:30.5789549164474 - timing_s/adv:0.08540594577789307 - timing_s/update_actor:24.65525033697486 - timing_s/update_weights:50.100122350268066 - timing_s/step:154.86487223673612 - timing_s/stop_profile:6.930623203516006e-05 - timing_per_token_ms/adv:6.523551572791585e-05 - timing_per_token_ms/update_actor:0.01883238873457628 - timing_per_token_ms/gen:0.03086031255504194 - timing_per_token_ms/ref:0.023357084524102158 - perf/total_num_tokens:1741116 - perf/time_per_step:154.86487223673612 - perf/throughput:2810.701960445913 - frontier/active_count:47.0 - frontier/completed_count:17.0 - frontier/blacklisted_count:1415.0 - frontier/mean_score:2.6959742017793222 - frontier/mean_frontier_pct:0.5777339175556644 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.1406185504198993 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:80.0 - frontier/cluster_2/score:2.569003007 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:3.063882699929509 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:1.9089443929999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.7807441683012997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:2.609841244490056 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.7931036690999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.683706392999999 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:3.034256476694303 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:3.011234221239156 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.5330755385699995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.3903110989592995 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.9217099899999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:3.072594475099999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.7481519899999993 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.7307269989480565 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.556269254331509 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:96.0 - frontier/cluster_25/score:2.1215080099999994 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:2.2678699999999994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.3682149929999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:2.9820063266387 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:176.0 - frontier/cluster_30/score:1.4249303335949626 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.5863412810999993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:160.0 - frontier/cluster_32/score:1.934513615228543 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:128.0 
- frontier/cluster_33/score:3.0968756226470893 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:3.6733185455724766 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:1.4519524751 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:128.0 - frontier/cluster_38/score:3.8960116149144763 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:128.0 - frontier/cluster_39/score:3.2892524384900566 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:4.1825497789999995 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:3.0194646026900562 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:144.0 - frontier/cluster_42/score:2.1852383861806994 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:3.103706392999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.856504237400999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.862436217629999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:160.0 - frontier/cluster_46/score:2.029939801942547 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:3.9851530989592994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.4879311840192997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:128.0 - 
frontier/cluster_49/score:3.538803427883039 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.5530562999999997 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.9396463929999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:32.0 - frontier/cluster_53/score:2.62613 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.8578015118909095 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:3.1907129969509094 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.7044338129999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.194876132569999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:1.7514337821409993 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:269.0 - cluster/prob_snapshot/cluster_0:0.024785723558269954 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.020274540613457494 - cluster/prob_snapshot/cluster_3:0.024180125155684812 - cluster/prob_snapshot/cluster_4:0.0 - 
cluster/prob_snapshot/cluster_5:0.01506536602691896 - cluster/prob_snapshot/cluster_6:0.01405361140645885 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.020596835489064137 - cluster/prob_snapshot/cluster_9:0.022043140324267078 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.02117977834639179 - cluster/prob_snapshot/cluster_12:0.023946315360768702 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.023764623999580358 - cluster/prob_snapshot/cluster_15:0.019991001467789704 - cluster/prob_snapshot/cluster_16:0.018864306239657326 - cluster/prob_snapshot/cluster_17:0.015166111963816002 - cluster/prob_snapshot/cluster_18:0.02424887837980644 - cluster/prob_snapshot/cluster_19:0.029580370104512886 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.02155086439898327 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.020174045991638998 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.01674291551754512 - cluster/prob_snapshot/cluster_26:0.017898002569773492 - cluster/prob_snapshot/cluster_27:0.018689924039071958 - cluster/prob_snapshot/cluster_28:0.023533957809424823 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.011245532932853668 - cluster/prob_snapshot/cluster_31:0.020411374062639865 - cluster/prob_snapshot/cluster_32:0.01526715801903208 - cluster/prob_snapshot/cluster_33:0.024440504902135708 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.028989785467532458 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.011458791346297976 - cluster/prob_snapshot/cluster_38:0.03074727647334578 - cluster/prob_snapshot/cluster_39:0.025958740402548988 - cluster/prob_snapshot/cluster_40:0.03300863219353295 - cluster/prob_snapshot/cluster_41:0.023829578070298008 - cluster/prob_snapshot/cluster_42:0.017245874874410726 - 
cluster/prob_snapshot/cluster_43:0.024494413259021203 - cluster/prob_snapshot/cluster_44:0.02254349684134098 - cluster/prob_snapshot/cluster_45:0.022590311957455433 - cluster/prob_snapshot/cluster_46:0.016020260328701918 - cluster/prob_snapshot/cluster_47:0.031450779985675706 - cluster/prob_snapshot/cluster_48:0.011742734881286647 - cluster/prob_snapshot/cluster_49:0.0279281937880804 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.02014868939497256 - cluster/prob_snapshot/cluster_52:0.015307665839032172 - cluster/prob_snapshot/cluster_53:0.020725386150246387 - cluster/prob_snapshot/cluster_54:0.022553734915901742 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.025181068323509317 - cluster/prob_snapshot/cluster_57:0.021343358893964968 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.025213923739387994 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.013822294193912227
[36m(TaskRunner pid=2823680)[0m Training Progress:  34%|███▍      | 270/800 [8:54:29<20:30:33, 139.31s/it]
[36m(TaskRunner pid=2823680)[0m step:270 - global_seqlen/min:425664 - global_seqlen/max:490772 - global_seqlen/minmax_diff:65108 - global_seqlen/balanced_min:459166 - global_seqlen/balanced_max:459315 - global_seqlen/mean:459268.0 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.1929987418310096) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.015031363815069199 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.08558358519803733) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0010079257458528446) - actor/ppo_kl:np.float64(-0.0004943596818698381) - actor/pg_clipfrac_lower:np.float64(9.247884626499096e-05) - actor/grad_norm:np.float64(0.4868007277448972) - perf/mfu/actor:np.float64(0.23399447943557417) - perf/max_memory_allocated_gb:np.float64(86.25694608688354) - perf/max_memory_reserved_gb:np.float64(92.296875) - perf/cpu_memory_used_gb:np.float64(112.90226745605469) - actor/lr:np.float64(1e-06) - training/global_step:270 - training/epoch:0 - critic/score/mean:0.5960526466369629 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6088433265686035 - critic/rewards/max:1.51688551902771 - critic/rewards/min:-0.09749110788106918 - critic/advantages/mean:-0.025838905945420265 - critic/advantages/max:2.4745006561279297 - critic/advantages/min:-2.4748220443725586 - critic/returns/mean:-0.025838905945420265 - critic/returns/max:2.4745006561279297 - critic/returns/min:-2.4748220443725586 - response_length/mean:1430.8289794921875 - response_length/max:8192.0 - response_length/min:127.0 - response_length/clip_ratio:0.03947368264198303 - response_length_non_aborted/mean:1430.8289794921875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:127.0 - response_length_non_aborted/clip_ratio:0.03947368264198303 - response/aborted_ratio:0.0 - prompt_length/mean:235.6631622314453 - prompt_length/max:388.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.527934551239014e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5504908403381705) - timing_s/agent_loop/generate_sequences/max:np.float64(34.00234964303672) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.446883555780914) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(34.00234964303672) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:181 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:36.605441361665726 - timing_s/reward:0.00015337113291025162 - timing_s/old_log_prob:11.40570289734751 - timing_s/ref:25.615901017561555 - timing_s/adv:0.09008750040084124 - timing_s/update_actor:23.515370082110167 - timing_s/update_weights:32.96492109447718 - timing_s/step:130.62981041893363 - timing_s/stop_profile:6.183981895446777e-05 - timing_per_token_ms/adv:7.112916068644129e-05 - timing_per_token_ms/update_actor:0.018566710472920717 - timing_per_token_ms/gen:0.03366234273623656 - timing_per_token_ms/ref:0.020225198074083725 - perf/total_num_tokens:1837072 - perf/time_per_step:130.62981041893363 - perf/throughput:3515.7977993469794 - frontier/active_count:45.0 - frontier/completed_count:19.0 - frontier/blacklisted_count:1448.0 - frontier/mean_score:2.6185210597003286 - frontier/mean_frontier_pct:0.5803262409597538 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.1406185504198993 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:96.0 - frontier/cluster_2/score:2.6983021048999998 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:3.063882699929509 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:1.9089443929999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.7807441683012997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:144.0 - frontier/cluster_8/score:2.7268888711430392 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.8551725683699996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:96.0 - frontier/cluster_11/score:2.7785944750999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:3.023979533686012 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:3.011234221239156 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.5330755385699995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.3903110989592995 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:1.9217099899999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:3.072594475099999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.7481519899999993 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.7307269989480565 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.556269254331509 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:112.0 - frontier/cluster_25/score:1.7850556069999997 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:2.2678699999999994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:2.5577504950999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:176.0 - frontier/cluster_30/score:1.4249303335949626 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.1104388967699994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:160.0 - frontier/cluster_32/score:1.934513615228543 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:128.0 - 
frontier/cluster_33/score:3.067812935852962 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:144.0 - frontier/cluster_35/score:3.4713229819007334 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:1.4519524751 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:144.0 - frontier/cluster_38/score:3.0272081304401333 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:128.0 - frontier/cluster_39/score:3.2892524384900566 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:3.0194646026900562 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:160.0 - frontier/cluster_42/score:1.8296668703264896 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:3.103706392999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.856504237400999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.862436217629999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:160.0 - frontier/cluster_46/score:2.029939801942547 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:3.9851530989592994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.4879311840192997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:144.0 - frontier/cluster_49/score:3.377162399518127 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.6871394099999995 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.9396463929999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:32.0 - frontier/cluster_53/score:2.62613 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.8578015118909095 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:3.1907129969509094 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.7044338129999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.194876132569999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:1.7514337821409993 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:270.0 - cluster/prob_snapshot/cluster_0:0.026653031139131485 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.02289928842681908 - cluster/prob_snapshot/cluster_3:0.02600180814602566 - cluster/prob_snapshot/cluster_4:0.0 - 
cluster/prob_snapshot/cluster_5:0.016200361022097674 - cluster/prob_snapshot/cluster_6:0.015112382801857824 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.023141891582411134 - cluster/prob_snapshot/cluster_9:0.02423057819682852 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.023580694019003826 - cluster/prob_snapshot/cluster_12:0.025663167742753616 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.02555500395142703 - cluster/prob_snapshot/cluster_15:0.021497084132758446 - cluster/prob_snapshot/cluster_16:0.020285505905916507 - cluster/prob_snapshot/cluster_17:0.016308696959394204 - cluster/prob_snapshot/cluster_18:0.02607574110259727 - cluster/prob_snapshot/cluster_19:0.031808896909913204 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.023174464063997417 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.021693918870406236 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.01514897206223634 - cluster/prob_snapshot/cluster_26:0.019246402821323384 - cluster/prob_snapshot/cluster_27:0.021706489501220935 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.012092749228435212 - cluster/prob_snapshot/cluster_31:0.017910355151320285 - cluster/prob_snapshot/cluster_32:0.01641735562533264 - cluster/prob_snapshot/cluster_33:0.02603516230819793 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.029459572388443336 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.012322074110593691 - cluster/prob_snapshot/cluster_38:0.025690567405731404 - cluster/prob_snapshot/cluster_39:0.027914420761419242 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02562485153386217 - cluster/prob_snapshot/cluster_42:0.015527567988964836 - cluster/prob_snapshot/cluster_43:0.026339774095867317 - 
cluster/prob_snapshot/cluster_44:0.02424187947891052 - cluster/prob_snapshot/cluster_45:0.024292221553639432 - cluster/prob_snapshot/cluster_46:0.017227195179275567 - cluster/prob_snapshot/cluster_47:0.03382021978650271 - cluster/prob_snapshot/cluster_48:0.012627409392092183 - cluster/prob_snapshot/cluster_49:0.028660473454895093 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.022804555605882723 - cluster/prob_snapshot/cluster_52:0.01646091522468436 - cluster/prob_snapshot/cluster_53:0.022286795910330833 - cluster/prob_snapshot/cluster_54:0.024252888869837973 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.02707816040008871 - cluster/prob_snapshot/cluster_57:0.022951325503051567 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.027113491078269027 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.014863638606251514
[36m(TaskRunner pid=2823680)[0m Training Progress:  34%|███▍      | 271/800 [8:56:38<20:01:07, 136.23s/it]
[36m(TaskRunner pid=2823680)[0m step:271 - global_seqlen/min:444746 - global_seqlen/max:527208 - global_seqlen/minmax_diff:82462 - global_seqlen/balanced_min:480306 - global_seqlen/balanced_max:480486 - global_seqlen/mean:480384.25 - frontier/skipped_zero_acc_count:25.0 - actor/entropy:np.float64(0.17989780673255715) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011181825771927834 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0250241688117967) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0007945540531251866) - actor/ppo_kl:np.float64(-8.081776799265036e-05) - actor/pg_clipfrac_lower:np.float64(2.6164500590871634e-05) - actor/grad_norm:np.float64(0.3411685193960483) - perf/mfu/actor:np.float64(0.20739505830229987) - perf/max_memory_allocated_gb:np.float64(89.63991594314575) - perf/max_memory_reserved_gb:np.float64(95.935546875) - perf/cpu_memory_used_gb:np.float64(113.07434844970703) - actor/lr:np.float64(1e-06) - training/global_step:271 - training/epoch:0 - critic/score/mean:0.5631067752838135 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5734027624130249 - critic/rewards/max:1.6194769144058228 - critic/rewards/min:-0.37037983536720276 - critic/advantages/mean:-0.06389471888542175 - critic/advantages/max:2.4708971977233887 - critic/advantages/min:-2.4746897220611572 - critic/returns/mean:-0.06389471888542175 - critic/returns/max:2.4708971977233887 - critic/returns/min:-2.4746897220611572 - response_length/mean:1573.9879150390625 - response_length/max:8192.0 - response_length/min:220.0 - response_length/clip_ratio:0.04247572645545006 - response_length_non_aborted/mean:1573.9879150390625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:220.0 - response_length_non_aborted/clip_ratio:0.04247572645545006 - response/aborted_ratio:0.0 - prompt_length/mean:248.86407470703125 - prompt_length/max:1168.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010171066969633102 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.600498728454113) - timing_s/agent_loop/generate_sequences/max:np.float64(36.2343921829015) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.798082197888107) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(36.2343921829015) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:186 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:38.13182958308607 - timing_s/reward:0.00014517363160848618 - timing_s/old_log_prob:12.045486737042665 - timing_s/ref:19.670152822509408 - timing_s/adv:0.14029527641832829 - timing_s/update_actor:27.67981757596135 - timing_s/update_weights:30.68275175523013 - timing_s/step:128.8213108535856 - timing_s/stop_profile:6.421748548746109e-05 - timing_per_token_ms/adv:9.340377783288502e-05 - timing_per_token_ms/update_actor:0.018428272122368628 - timing_per_token_ms/gen:0.0294007935312769 - timing_per_token_ms/ref:0.013095712350957975 - perf/total_num_tokens:1921537 - perf/time_per_step:128.8213108535856 - perf/throughput:3729.074380759797 - frontier/active_count:45.0 - frontier/completed_count:19.0 - frontier/blacklisted_count:1473.0 - frontier/mean_score:2.619210832547035 - frontier/mean_frontier_pct:0.5977857607586291 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.1406185504198993 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:96.0 - frontier/cluster_2/score:2.6983021048999998 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:3.044717889950656 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:1.9089443929999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.7807441683012997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:144.0 - frontier/cluster_8/score:2.808822209800127 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:2.8986207978589995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:2.2450161325699995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:3.023979533686012 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:3.0078639548674087 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.5330755385699995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.3903110989592995 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:96.0 - frontier/cluster_17/score:2.2451969929999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:3.072594475099999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.5237063929999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.7307269989480565 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.556269254331509 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:128.0 - frontier/cluster_25/score:1.5495389248999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:2.2678699999999994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:2.6904253465699997 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:176.0 - frontier/cluster_30/score:1.4249303335949626 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.1104388967699994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:160.0 - frontier/cluster_32/score:1.934513615228543 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - 
frontier/cluster_33/score:3.647469055097073 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:144.0 - frontier/cluster_35/score:3.4713229819007334 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:1.4519524751 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:144.0 - frontier/cluster_38/score:3.0272081304401333 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:128.0 - frontier/cluster_39/score:3.2892524384900566 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:128.0 - frontier/cluster_41/score:3.0136252218830393 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:176.0 - frontier/cluster_42/score:1.5807668092285427 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:3.103706392999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:2.899552966180699 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.862436217629999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:160.0 - frontier/cluster_46/score:2.029939801942547 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:3.9851530989592994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.4879311840192997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:144.0 - frontier/cluster_49/score:3.377162399518127 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.6871394099999995 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.9396463929999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:2.7382909999999994 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.9004610583236365 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:128.0 - frontier/cluster_56/score:3.1334990978656365 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.7044338129999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.194876132569999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:1.7514337821409993 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:271.0 - cluster/prob_snapshot/cluster_0:0.026646012025994904 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.02289325786709115 - cluster/prob_snapshot/cluster_3:0.025832360157377287 - cluster/prob_snapshot/cluster_4:0.0 - 
cluster/prob_snapshot/cluster_5:0.01619609463429833 - cluster/prob_snapshot/cluster_6:0.015108402934648883 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.023830945776974147 - cluster/prob_snapshot/cluster_9:0.02459282571206342 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.01904743473511445 - cluster/prob_snapshot/cluster_12:0.025656409311530952 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.025519679587716206 - cluster/prob_snapshot/cluster_15:0.02149142284549822 - cluster/prob_snapshot/cluster_16:0.020280163689481795 - cluster/prob_snapshot/cluster_17:0.019048969212834502 - cluster/prob_snapshot/cluster_18:0.026068874019601576 - cluster/prob_snapshot/cluster_19:0.02989625177861848 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.023168361036379446 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02168820574644175 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.013146783719440327 - cluster/prob_snapshot/cluster_26:0.019241334254143542 - cluster/prob_snapshot/cluster_27:0.022826428930747072 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.012089564585963457 - cluster/prob_snapshot/cluster_31:0.017905638434168408 - cluster/prob_snapshot/cluster_32:0.016413032091700155 - cluster/prob_snapshot/cluster_33:0.030946293778200653 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.029451814168732028 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.012318829075092548 - cluster/prob_snapshot/cluster_38:0.02568380175876906 - cluster/prob_snapshot/cluster_39:0.02790706945955629 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02556856001929964 - cluster/prob_snapshot/cluster_42:0.013411731075512422 - cluster/prob_snapshot/cluster_43:0.026332837479412487 - 
cluster/prob_snapshot/cluster_44:0.024600734526174874 - cluster/prob_snapshot/cluster_45:0.02428582416301869 - cluster/prob_snapshot/cluster_46:0.017222658373260594 - cluster/prob_snapshot/cluster_47:0.033811313184182464 - cluster/prob_snapshot/cluster_48:0.012624083946116364 - cluster/prob_snapshot/cluster_49:0.028652925679008828 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.022798549994176066 - cluster/prob_snapshot/cluster_52:0.016456580219570287 - cluster/prob_snapshot/cluster_53:0.023232536439969212 - cluster/prob_snapshot/cluster_54:0.024608439070287112 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.02658560831400829 - cluster/prob_snapshot/cluster_57:0.022945281239286616 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.02710635070236159 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.014859724246175397
[36m(TaskRunner pid=2823680)[0m Training Progress:  34%|███▍      | 272/800 [8:58:56<20:02:34, 136.66s/it]
[36m(TaskRunner pid=2823680)[0m step:272 - global_seqlen/min:340589 - global_seqlen/max:632639 - global_seqlen/minmax_diff:292050 - global_seqlen/balanced_min:487631 - global_seqlen/balanced_max:487751 - global_seqlen/mean:487676.75 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.19258678561829506) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01441585086286068 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0532508953474462) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0009592738246515472) - actor/ppo_kl:np.float64(-8.053156807847622e-05) - actor/pg_clipfrac_lower:np.float64(6.853580605898156e-05) - actor/grad_norm:np.float64(0.4771323812504609) - perf/mfu/actor:np.float64(0.2267846548075403) - perf/max_memory_allocated_gb:np.float64(91.36168622970581) - perf/max_memory_reserved_gb:np.float64(97.48828125) - perf/cpu_memory_used_gb:np.float64(113.1935806274414) - actor/lr:np.float64(1e-06) - training/global_step:272 - training/epoch:0 - critic/score/mean:0.5492021441459656 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5705239176750183 - critic/rewards/max:1.8410348892211914 - critic/rewards/min:-0.1437099128961563 - critic/advantages/mean:-0.037996355444192886 - critic/advantages/max:2.474503993988037 - critic/advantages/min:-2.4746310710906982 - critic/returns/mean:-0.037996355444192886 - critic/returns/max:2.474503993988037 - critic/returns/min:-2.4746310710906982 - response_length/mean:1478.8603515625 - response_length/max:8192.0 - response_length/min:133.0 - response_length/clip_ratio:0.04388297721743584 - response_length_non_aborted/mean:1478.8603515625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:133.0 - response_length_non_aborted/clip_ratio:0.04388297721743584 - response/aborted_ratio:0.0 - prompt_length/mean:245.91488647460938 - prompt_length/max:556.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.015947580337524e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.041400427930057) - timing_s/agent_loop/generate_sequences/max:np.float64(35.9294202812016) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.508503840510457) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(35.9294202812016) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:193 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:37.64781902637333 - timing_s/reward:0.00017538294196128845 - timing_s/old_log_prob:11.97131844703108 - timing_s/ref:27.01600520964712 - timing_s/adv:0.06958286836743355 - timing_s/update_actor:25.8756696684286 - timing_s/update_weights:33.80787899065763 - timing_s/step:136.78084562439471 - timing_s/stop_profile:5.220342427492142e-05 - timing_per_token_ms/adv:5.3647806696550464e-05 - timing_per_token_ms/update_actor:0.019949923840238668 - timing_per_token_ms/gen:0.03385281671425518 - timing_per_token_ms/ref:0.02082911295847757 - perf/total_num_tokens:1950707 - perf/time_per_step:136.78084562439471 - perf/throughput:3565.3877395902236 - frontier/active_count:43.0 - frontier/completed_count:21.0 - frontier/blacklisted_count:1507.0 - frontier/mean_score:2.604616449053594 - frontier/mean_frontier_pct:0.5977877555120936 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:96.0 - frontier/cluster_2/score:2.6983021048999998 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:3.044717889950656 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:1.9089443929999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.7807441683012997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:144.0 - frontier/cluster_8/score:2.808822209800127 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:2.3290345585012995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:2.2450161325699995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:160.0 - frontier/cluster_12/score:3.016785673580208 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:3.0078639548674087 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:2.5330755385699995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.5732177692715092 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:96.0 - frontier/cluster_17/score:2.2451969929999995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.050816132569999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.5237063929999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:144.0 - frontier/cluster_21/score:2.2115088992636394 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.556269254331509 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:128.0 - frontier/cluster_25/score:1.5495389248999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:2.4875089999999993 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:2.6904253465699997 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.1104388967699994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:176.0 - frontier/cluster_32/score:1.6541595306599801 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:3.647469055097073 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:144.0 - frontier/cluster_35/score:3.4713229819007334 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:1.4519524751 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:144.0 - frontier/cluster_38/score:3.019045691308093 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.6024767069430395 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:128.0 - frontier/cluster_41/score:3.0136252218830393 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:176.0 - frontier/cluster_42/score:1.5807668092285427 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:3.103706392999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:2.899552966180699 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.862436217629999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:160.0 - frontier/cluster_46/score:2.3209578613597825 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:3.6896071692715093 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.4879311840192997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:144.0 - frontier/cluster_49/score:3.377162399518127 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.6871394099999995 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.9396463929999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:2.8168036999999995 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:2.9004610583236365 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:128.0 - frontier/cluster_56/score:3.1334990978656365 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.7044338129999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.136413292798999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:2.1260036474986994 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:272.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.024092304171948832 - cluster/prob_snapshot/cluster_3:0.027185343475534956 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.017044373526587255 - 
cluster/prob_snapshot/cluster_6:0.01589971340763896 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.025079103974511432 - cluster/prob_snapshot/cluster_9:0.020795228565584724 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.020045054050318485 - cluster/prob_snapshot/cluster_12:0.02693594536263593 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.026856286098176623 - cluster/prob_snapshot/cluster_15:0.022617047310947137 - cluster/prob_snapshot/cluster_16:0.022975464861911897 - cluster/prob_snapshot/cluster_17:0.020046668897108367 - cluster/prob_snapshot/cluster_18:0.027239792796029037 - cluster/prob_snapshot/cluster_19:0.0314620835371371 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.019745878337075952 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02282413681882295 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0138353533643146 - cluster/prob_snapshot/cluster_26:0.02221019779424635 - cluster/prob_snapshot/cluster_27:0.024021975035255546 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.018843455573400038 - cluster/prob_snapshot/cluster_32:0.01476947836538315 - cluster/prob_snapshot/cluster_33:0.032567122033367056 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.03099436827594527 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.012964034164224755 - cluster/prob_snapshot/cluster_38:0.02695612436128672 - cluster/prob_snapshot/cluster_39:0.023236708858590586 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02690772666782386 - cluster/prob_snapshot/cluster_42:0.014114177476160138 - cluster/prob_snapshot/cluster_43:0.027712033557987886 - cluster/prob_snapshot/cluster_44:0.025889210810399896 - 
cluster/prob_snapshot/cluster_45:0.025557806852950758 - cluster/prob_snapshot/cluster_46:0.020723114237139153 - cluster/prob_snapshot/cluster_47:0.03294336020998904 - cluster/prob_snapshot/cluster_48:0.013285276918111981 - cluster/prob_snapshot/cluster_49:0.03015363758546772 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.02399263592486075 - cluster/prob_snapshot/cluster_52:0.017318502180063065 - cluster/prob_snapshot/cluster_53:0.0251503682296486 - cluster/prob_snapshot/cluster_54:0.025897318884022965 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.027978043396702606 - cluster/prob_snapshot/cluster_57:0.024147052295359676 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.028004063341105206 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.01898242841422295
[36m(TaskRunner pid=2823680)[0m Training Progress:  34%|███▍      | 273/800 [9:01:04<19:38:45, 134.20s/it]
[36m(TaskRunner pid=2823680)[0m step:273 - global_seqlen/min:416872 - global_seqlen/max:463722 - global_seqlen/minmax_diff:46850 - global_seqlen/balanced_min:432502 - global_seqlen/balanced_max:432612 - global_seqlen/mean:432557.0 - frontier/skipped_zero_acc_count:27.0 - actor/entropy:np.float64(0.1751857681893835) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013155578635632992 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.040589062293292955) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000985144209135244) - actor/ppo_kl:np.float64(0.0010717600856718062) - actor/pg_clipfrac_lower:np.float64(3.590551665076739e-05) - actor/grad_norm:np.float64(0.4432456103655008) - perf/mfu/actor:np.float64(0.20429220871694834) - perf/max_memory_allocated_gb:np.float64(91.36168622970581) - perf/max_memory_reserved_gb:np.float64(97.48828125) - perf/cpu_memory_used_gb:np.float64(113.56603622436523) - actor/lr:np.float64(1e-06) - training/global_step:273 - training/epoch:0 - critic/score/mean:0.6126237511634827 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6196050643920898 - critic/rewards/max:1.3602516651153564 - critic/rewards/min:-0.11734101921319962 - critic/advantages/mean:-0.01367966365069151 - critic/advantages/max:2.4730145931243896 - critic/advantages/min:-2.474853038787842 - critic/returns/mean:-0.01367966365069151 - critic/returns/max:2.4730145931243896 - critic/returns/min:-2.474853038787842 - response_length/mean:1438.5667724609375 - response_length/max:8192.0 - response_length/min:113.0 - response_length/clip_ratio:0.029702970758080482 - response_length_non_aborted/mean:1438.5667724609375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:113.0 - response_length_non_aborted/clip_ratio:0.029702970758080482 - response/aborted_ratio:0.0 - prompt_length/mean:242.16831970214844 - prompt_length/max:543.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.229592978954315e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.288141873665154) - timing_s/agent_loop/generate_sequences/max:np.float64(30.926381072960794) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.142679790745206) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.926381072960794) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:175 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.50066489353776 - timing_s/reward:0.00012963265180587769 - timing_s/old_log_prob:12.203855881467462 - timing_s/ref:26.65371085330844 - timing_s/adv:0.07540340535342693 - timing_s/update_actor:25.37901295349002 - timing_s/update_weights:31.0186973977834 - timing_s/step:128.23991347756237 - timing_s/stop_profile:5.131866782903671e-05 - timing_per_token_ms/adv:5.5523945168844765e-05 - timing_per_token_ms/update_actor:0.018688054167634992 - timing_per_token_ms/gen:0.027960880425837872 - timing_per_token_ms/ref:0.019626688914495836 - perf/total_num_tokens:1730228 - perf/time_per_step:128.23991347756237 - perf/throughput:3373.0294123731046 - frontier/active_count:42.0 - frontier/completed_count:22.0 - frontier/blacklisted_count:1534.0 - frontier/mean_score:2.616776685917926 - frontier/mean_frontier_pct:0.6107902595555775 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:14.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:96.0 - frontier/cluster_2/score:2.6983021048999998 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:144.0 - frontier/cluster_3/score:3.031302522965459 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:1.9089443929999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.7807441683012997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:144.0 - frontier/cluster_8/score:2.808822209800127 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:2.3290345585012995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:2.2450161325699995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:160.0 - frontier/cluster_12/score:3.0117499715061453 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:3.0055047684071856 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:2.6731528769989996 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.5732177692715092 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:96.0 - frontier/cluster_17/score:2.4716378950999998 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.050816132569999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.3665944750999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:144.0 - frontier/cluster_21/score:2.448056229484547 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.689388478032056 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:128.0 - frontier/cluster_25/score:1.5495389248999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:2.4875089999999993 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:2.7832977425989993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.1104388967699994 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:192.0 - frontier/cluster_32/score:1.457911671461986 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - 
frontier/cluster_33/score:3.647469055097073 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:144.0 - frontier/cluster_35/score:3.329926087330513 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:1.4519524751 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:144.0 - frontier/cluster_38/score:3.019045691308093 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.721733694860127 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:128.0 - frontier/cluster_41/score:3.0136252218830393 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:176.0 - frontier/cluster_42/score:1.5807668092285427 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:3.103706392999999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:2.899552966180699 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.862436217629999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:160.0 - frontier/cluster_46/score:2.3209578613597825 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.4827250184900564 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.4879311840192997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - frontier/cluster_49/score:3.8640136796626887 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:80.0 - frontier/cluster_51/score:2.7809975869999994 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.9396463929999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:2.8168036999999995 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:128.0 - frontier/cluster_56/score:3.1334990978656365 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.7044338129999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.136413292798999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:2.1260036474986994 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:273.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.02455130717024425 - cluster/prob_snapshot/cluster_3:0.02758121087780106 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01736910039774637 - 
cluster/prob_snapshot/cluster_6:0.0162026323843403 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.02555690733597929 - cluster/prob_snapshot/cluster_9:0.021191416169465113 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.020426949440830885 - cluster/prob_snapshot/cluster_12:0.027403306151726134 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0273464823070778 - cluster/prob_snapshot/cluster_15:0.02432247941290356 - cluster/prob_snapshot/cluster_16:0.02341319000366597 - cluster/prob_snapshot/cluster_17:0.022488935195959063 - cluster/prob_snapshot/cluster_18:0.02775876127978581 - cluster/prob_snapshot/cluster_19:0.030631964792130085 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.02227437037200566 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.024470203875383966 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.014098942460292018 - cluster/prob_snapshot/cluster_26:0.022633343181567296 - cluster/prob_snapshot/cluster_27:0.025324665351853893 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.01920245828832124 - cluster/prob_snapshot/cluster_32:0.013265244543280659 - cluster/prob_snapshot/cluster_33:0.03318758600196384 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.03029832652014732 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.013211023016304403 - cluster/prob_snapshot/cluster_38:0.027469688436185826 - cluster/prob_snapshot/cluster_39:0.02476450648604867 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.027420368677060356 - cluster/prob_snapshot/cluster_42:0.014383078687675396 - cluster/prob_snapshot/cluster_43:0.028239999102553346 - cluster/prob_snapshot/cluster_44:0.026382448206900624 - 
cluster/prob_snapshot/cluster_45:0.02604473039051004 - cluster/prob_snapshot/cluster_46:0.021117927929552197 - cluster/prob_snapshot/cluster_47:0.031688613207234935 - cluster/prob_snapshot/cluster_48:0.013538386039393054 - cluster/prob_snapshot/cluster_49:0.03515788190920165 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.025303736699518093 - cluster/prob_snapshot/cluster_52:0.017648451709585032 - cluster/prob_snapshot/cluster_53:0.025629529307113475 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.028511076921178775 - cluster/prob_snapshot/cluster_57:0.024607098346765218 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.028537592593694844 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.019344078819052275
(TaskRunner pid=2823680) Training Progress:  34%|███▍      | 274/800 [9:03:08<19:09:18, 131.10s/it]
[36m(TaskRunner pid=2823680)[0m step:274 - global_seqlen/min:406340 - global_seqlen/max:572752 - global_seqlen/minmax_diff:166412 - global_seqlen/balanced_min:479221 - global_seqlen/balanced_max:479330 - global_seqlen/mean:479276.0 - frontier/skipped_zero_acc_count:40.0 - actor/entropy:np.float64(0.1953914766411551) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012181929312646389 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.005091741782962345) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006633651024788957) - actor/ppo_kl:np.float64(0.00013238480855580872) - actor/pg_clipfrac_lower:np.float64(8.373608145782121e-06) - actor/grad_norm:np.float64(0.3985537182201039) - perf/mfu/actor:np.float64(0.29136403775731234) - perf/max_memory_allocated_gb:np.float64(91.36168622970581) - perf/max_memory_reserved_gb:np.float64(97.48828125) - perf/cpu_memory_used_gb:np.float64(113.32109069824219) - actor/lr:np.float64(1e-06) - training/global_step:274 - training/epoch:0 - critic/score/mean:0.6548295617103577 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6682878732681274 - critic/rewards/max:1.146996021270752 - critic/rewards/min:-0.06987544894218445 - critic/advantages/mean:-0.12039005011320114 - critic/advantages/max:2.4730424880981445 - critic/advantages/min:-2.4748616218566895 - critic/returns/mean:-0.12039005011320114 - critic/returns/max:2.4730424880981445 - critic/returns/min:-2.4748616218566895 - response_length/mean:1395.4644775390625 - response_length/max:8192.0 - response_length/min:112.0 - response_length/clip_ratio:0.04119318351149559 - response_length_non_aborted/mean:1395.4644775390625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:112.0 - response_length_non_aborted/clip_ratio:0.04119318351149559 - response/aborted_ratio:0.0 - prompt_length/mean:245.98863220214844 - prompt_length/max:382.0 - prompt_length/min:184.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.344052523374557e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9577325945720077) - timing_s/agent_loop/generate_sequences/max:np.float64(34.85893595404923) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.419911592638528) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(34.85893595404923) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:212 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:36.39017285685986 - timing_s/reward:0.0001332433894276619 - timing_s/old_log_prob:10.663144086487591 - timing_s/ref:23.916467062197626 - timing_s/adv:0.0652196342125535 - timing_s/update_actor:19.80675884243101 - timing_s/update_weights:32.353130146861076 - timing_s/step:123.59367135819048 - timing_s/stop_profile:5.960837006568909e-05 - timing_per_token_ms/adv:5.6438727648774253e-05 - timing_per_token_ms/update_actor:0.01714005730651196 - timing_per_token_ms/gen:0.037041850126128845 - timing_per_token_ms/ref:0.02069645110926487 - perf/total_num_tokens:1917104 - perf/time_per_step:123.59367135819048 - perf/throughput:3877.836095757654 - frontier/active_count:41.0 - frontier/completed_count:23.0 - frontier/blacklisted_count:1572.0 - frontier/mean_score:2.6134444868431754 - frontier/mean_frontier_pct:0.6157418359943493 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:96.0 - frontier/cluster_2/score:2.7888114734299996 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:144.0 - frontier/cluster_3/score:3.031302522965459 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:1.9089443929999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.7807441683012997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:144.0 - frontier/cluster_8/score:2.808822209800127 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.9303241909509097 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:2.2450161325699995 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:160.0 - frontier/cluster_12/score:3.0117499715061453 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:3.0038533378850296 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:2.6731528769989996 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.5732177692715092 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:96.0 - frontier/cluster_17/score:2.4716378950999998 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.050816132569999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.3665944750999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:160.0 - frontier/cluster_21/score:2.6136393606391826 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.689388478032056 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:128.0 - frontier/cluster_25/score:1.5495389248999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:2.4875089999999993 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:2.7832977425989993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.3773072277389993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:192.0 - frontier/cluster_32/score:1.457911671461986 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - 
frontier/cluster_33/score:3.453228338567951 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:3.830948261131359 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:1.4519524751 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:160.0 - frontier/cluster_38/score:3.013331983915665 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.721733694860127 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:128.0 - frontier/cluster_41/score:3.0095376553181272 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:192.0 - frontier/cluster_42/score:1.40653676645998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:3.072594475099999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:2.899552966180699 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:112.0 - frontier/cluster_45/score:2.903705352340999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:176.0 - frontier/cluster_46/score:2.5246705029518477 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.337907512943039 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.4879311840192997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - frontier/cluster_49/score:3.8640136796626887 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:80.0 - frontier/cluster_51/score:2.8466983108999995 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.9396463929999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:2.8168036999999995 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.7044338129999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.136413292798999 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:2.1260036474986994 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:274.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.026026874638933047 - cluster/prob_snapshot/cluster_3:0.028289947710545296 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.017815423122954324 - 
cluster/prob_snapshot/cluster_6:0.016618981122946228 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.026213626928178864 - cluster/prob_snapshot/cluster_9:0.018014952322534698 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.020951847767937088 - cluster/prob_snapshot/cluster_12:0.02810747147988172 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.02803377532103969 - cluster/prob_snapshot/cluster_15:0.024947478695928605 - cluster/prob_snapshot/cluster_16:0.024014823855100426 - cluster/prob_snapshot/cluster_17:0.023066819059477287 - cluster/prob_snapshot/cluster_18:0.02847206051227793 - cluster/prob_snapshot/cluster_19:0.03141909490776184 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.02439206258251382 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.025098999139960405 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.014461234016983357 - cluster/prob_snapshot/cluster_26:0.02321493780524019 - cluster/prob_snapshot/cluster_27:0.0259754171695062 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.02218646824429936 - cluster/prob_snapshot/cluster_32:0.013606113094876755 - cluster/prob_snapshot/cluster_33:0.032227614495926614 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.03575272516290698 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.013550498271809694 - cluster/prob_snapshot/cluster_38:0.028122235776088935 - cluster/prob_snapshot/cluster_39:0.025400864257618543 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.028086824807765007 - cluster/prob_snapshot/cluster_42:0.013126651422829863 - cluster/prob_snapshot/cluster_43:0.02867530917080948 - cluster/prob_snapshot/cluster_44:0.027060381197770402 - 
cluster/prob_snapshot/cluster_45:0.027099133775732816 - cluster/prob_snapshot/cluster_46:0.023561751416678947 - cluster/prob_snapshot/cluster_47:0.031151370834283067 - cluster/prob_snapshot/cluster_48:0.013886273334281583 - cluster/prob_snapshot/cluster_49:0.03606131163825604 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.026567109601543473 - cluster/prob_snapshot/cluster_52:0.01810195274777035 - cluster/prob_snapshot/cluster_53:0.026288115019913676 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.025239411301500248 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.029270904959080496 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.019841151308556515
(TaskRunner pid=2823680) local_global_step_folder: /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/checkpoints/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633/global_step_275
(WorkerDict pid=2825160) /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.
(WorkerDict pid=2825160)   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")
(TaskRunner pid=2823680) test_gen_batch meta info: {'eos_token_id': 151643, 'pad_token_id': 151643, 'recompute_log_prob': False, 'do_sample': True, 'validate': True, 'global_steps': 275}
(TaskRunner pid=2823680) validation generation end
(TaskRunner pid=2823680) Updated best checkpoint at step 275: val-core/aime2025/acc/best@16/mean=0.292600
(TaskRunner pid=2823680) Training Progress:  34%|███▍      | 275/800 [9:08:14<26:45:17, 183.46s/it]
(WorkerDict pid=2825159) /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead. [repeated 3x across cluster]
(WorkerDict pid=2825159)   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}") [repeated 3x across cluster]
[36m(TaskRunner pid=2823680)[0m step:275 - global_seqlen/min:371135 - global_seqlen/max:584848 - global_seqlen/minmax_diff:213713 - global_seqlen/balanced_min:463768 - global_seqlen/balanced_max:463883 - global_seqlen/mean:463810.25 - frontier/skipped_zero_acc_count:25.0 - actor/entropy:np.float64(0.1924986134713086) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.00916930940002203 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.017872144142529578) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0008736824940150165) - actor/ppo_kl:np.float64(0.00041141982090018805) - actor/pg_clipfrac_lower:np.float64(1.2223003138317905e-05) - actor/grad_norm:np.float64(0.4034877121448517) - perf/mfu/actor:np.float64(0.21421339017817004) - perf/max_memory_allocated_gb:np.float64(91.36168622970581) - perf/max_memory_reserved_gb:np.float64(97.48828125) - perf/cpu_memory_used_gb:np.float64(113.35020446777344) - actor/lr:np.float64(1e-06) - val-aux/aime2024/reward/mean@16:np.float64(0.11458333333333333) - val-aux/aime2024/reward/std@16:np.float64(0.13256322152553768) - val-aux/aime2024/reward/best@2/mean:np.float64(0.16296666666666668) - val-aux/aime2024/reward/best@2/std:np.float64(0.13599034552077963) - val-aux/aime2024/reward/worst@2/mean:np.float64(0.06813333333333332) - val-aux/aime2024/reward/worst@2/std:np.float64(0.08487138189136488) - val-aux/aime2024/reward/maj@2/mean:np.float64(0.1144) - val-aux/aime2024/reward/maj@2/std:np.float64(0.13190730922462585) - val-aux/aime2024/reward/best@4/mean:np.float64(0.20953333333333335) - val-aux/aime2024/reward/best@4/std:np.float64(0.13353543066842033) - val-aux/aime2024/reward/worst@4/mean:np.float64(0.035666666666666666) - val-aux/aime2024/reward/worst@4/std:np.float64(0.05190491585718613) - val-aux/aime2024/reward/maj@4/mean:np.float64(0.1350333333333333) - val-aux/aime2024/reward/maj@4/std:np.float64(0.10897248680644103) - val-aux/aime2024/reward/best@8/mean:np.float64(0.2658) - 
val-aux/aime2024/reward/best@8/std:np.float64(0.13023052260937654) - val-aux/aime2024/reward/worst@8/mean:np.float64(0.014166666666666668) - val-aux/aime2024/reward/worst@8/std:np.float64(0.032067529950910514) - val-aux/aime2024/reward/maj@8/mean:np.float64(0.1447) - val-aux/aime2024/reward/maj@8/std:np.float64(0.07884962682517418) - val-aux/aime2024/reward/best@16/mean:np.float64(0.32883333333333337) - val-aux/aime2024/reward/best@16/std:np.float64(0.110938581341787) - val-aux/aime2024/reward/worst@16/mean:np.float64(0.004066666666666667) - val-aux/aime2024/reward/worst@16/std:np.float64(0.014078852070363154) - val-aux/aime2024/reward/maj@16/mean:np.float64(0.14833333333333334) - val-aux/aime2024/reward/maj@16/std:np.float64(0.05493771280520521) - val-aux/aime2024/score/mean@16:np.float64(0.11458333333333333) - val-aux/aime2024/score/std@16:np.float64(0.13256322152553768) - val-aux/aime2024/score/best@2/mean:np.float64(0.16296666666666668) - val-aux/aime2024/score/best@2/std:np.float64(0.13599034552077963) - val-aux/aime2024/score/worst@2/mean:np.float64(0.06813333333333332) - val-aux/aime2024/score/worst@2/std:np.float64(0.08487138189136488) - val-aux/aime2024/score/maj@2/mean:np.float64(0.1144) - val-aux/aime2024/score/maj@2/std:np.float64(0.13190730922462585) - val-aux/aime2024/score/best@4/mean:np.float64(0.20953333333333335) - val-aux/aime2024/score/best@4/std:np.float64(0.13353543066842033) - val-aux/aime2024/score/worst@4/mean:np.float64(0.035666666666666666) - val-aux/aime2024/score/worst@4/std:np.float64(0.05190491585718613) - val-aux/aime2024/score/maj@4/mean:np.float64(0.1350333333333333) - val-aux/aime2024/score/maj@4/std:np.float64(0.10897248680644103) - val-aux/aime2024/score/best@8/mean:np.float64(0.2658) - val-aux/aime2024/score/best@8/std:np.float64(0.13023052260937654) - val-aux/aime2024/score/worst@8/mean:np.float64(0.014166666666666668) - val-aux/aime2024/score/worst@8/std:np.float64(0.032067529950910514) - 
val-aux/aime2024/score/maj@8/mean:np.float64(0.1447) - val-aux/aime2024/score/maj@8/std:np.float64(0.07884962682517418) - val-aux/aime2024/score/best@16/mean:np.float64(0.32883333333333337) - val-aux/aime2024/score/best@16/std:np.float64(0.110938581341787) - val-aux/aime2024/score/worst@16/mean:np.float64(0.004066666666666667) - val-aux/aime2024/score/worst@16/std:np.float64(0.014078852070363154) - val-aux/aime2024/score/maj@16/mean:np.float64(0.14833333333333334) - val-aux/aime2024/score/maj@16/std:np.float64(0.05493771280520521) - val-core/aime2024/acc/mean@16:np.float64(0.11458333333333333) - val-aux/aime2024/acc/std@16:np.float64(0.13256322152553768) - val-aux/aime2024/acc/best@2/mean:np.float64(0.16296666666666668) - val-aux/aime2024/acc/best@2/std:np.float64(0.13599034552077963) - val-aux/aime2024/acc/worst@2/mean:np.float64(0.06813333333333332) - val-aux/aime2024/acc/worst@2/std:np.float64(0.08487138189136488) - val-aux/aime2024/acc/maj@2/mean:np.float64(0.1144) - val-aux/aime2024/acc/maj@2/std:np.float64(0.13190730922462585) - val-aux/aime2024/acc/best@4/mean:np.float64(0.20953333333333335) - val-aux/aime2024/acc/best@4/std:np.float64(0.13353543066842033) - val-aux/aime2024/acc/worst@4/mean:np.float64(0.035666666666666666) - val-aux/aime2024/acc/worst@4/std:np.float64(0.05190491585718613) - val-aux/aime2024/acc/maj@4/mean:np.float64(0.1350333333333333) - val-aux/aime2024/acc/maj@4/std:np.float64(0.10897248680644103) - val-aux/aime2024/acc/best@8/mean:np.float64(0.2658) - val-aux/aime2024/acc/best@8/std:np.float64(0.13023052260937654) - val-aux/aime2024/acc/worst@8/mean:np.float64(0.014166666666666668) - val-aux/aime2024/acc/worst@8/std:np.float64(0.032067529950910514) - val-aux/aime2024/acc/maj@8/mean:np.float64(0.1447) - val-aux/aime2024/acc/maj@8/std:np.float64(0.07884962682517418) - val-core/aime2024/acc/best@16/mean:np.float64(0.32883333333333337) - val-core/aime2024/acc/best@16/std:np.float64(0.110938581341787) - 
val-aux/aime2024/acc/worst@16/mean:np.float64(0.004066666666666667) - val-aux/aime2024/acc/worst@16/std:np.float64(0.014078852070363154) - val-core/aime2024/acc/maj@16/mean:np.float64(0.14833333333333334) - val-core/aime2024/acc/maj@16/std:np.float64(0.05493771280520521) - val-aux/aime2025/reward/mean@16:np.float64(0.07291666666666667) - val-aux/aime2025/reward/std@16:np.float64(0.12495585900865498) - val-aux/aime2025/reward/best@2/mean:np.float64(0.1218) - val-aux/aime2025/reward/best@2/std:np.float64(0.143002686063943) - val-aux/aime2025/reward/worst@2/mean:np.float64(0.021866666666666666) - val-aux/aime2025/reward/worst@2/std:np.float64(0.06811524398544852) - val-aux/aime2025/reward/maj@2/mean:np.float64(0.0725) - val-aux/aime2025/reward/maj@2/std:np.float64(0.12547860656816717) - val-aux/aime2025/reward/best@4/mean:np.float64(0.18433333333333335) - val-aux/aime2025/reward/best@4/std:np.float64(0.1396767239143393) - val-aux/aime2025/reward/worst@4/mean:np.float64(0.0036666666666666666) - val-aux/aime2025/reward/worst@4/std:np.float64(0.022575571197701972) - val-aux/aime2025/reward/maj@4/mean:np.float64(0.08776666666666667) - val-aux/aime2025/reward/maj@4/std:np.float64(0.12755164756812437) - val-aux/aime2025/reward/best@8/mean:np.float64(0.24509999999999998) - val-aux/aime2025/reward/best@8/std:np.float64(0.11406887430941023) - val-aux/aime2025/reward/worst@8/mean:np.float64(0.0002) - val-aux/aime2025/reward/worst@8/std:np.float64(0.0034046880385629214) - val-aux/aime2025/reward/maj@8/mean:np.float64(0.10450000000000001) - val-aux/aime2025/reward/maj@8/std:np.float64(0.12428814348177199) - val-aux/aime2025/reward/best@16/mean:np.float64(0.29259999999999997) - val-aux/aime2025/reward/best@16/std:np.float64(0.07468123924729826) - val-aux/aime2025/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2025/reward/worst@16/std:np.float64(0.0) - val-aux/aime2025/reward/maj@16/mean:np.float64(0.12040000000000001) - 
val-aux/aime2025/reward/maj@16/std:np.float64(0.11357245803099218) - val-aux/aime2025/score/mean@16:np.float64(0.07291666666666667) - val-aux/aime2025/score/std@16:np.float64(0.12495585900865498) - val-aux/aime2025/score/best@2/mean:np.float64(0.1218) - val-aux/aime2025/score/best@2/std:np.float64(0.143002686063943) - val-aux/aime2025/score/worst@2/mean:np.float64(0.021866666666666666) - val-aux/aime2025/score/worst@2/std:np.float64(0.06811524398544852) - val-aux/aime2025/score/maj@2/mean:np.float64(0.0725) - val-aux/aime2025/score/maj@2/std:np.float64(0.12547860656816717) - val-aux/aime2025/score/best@4/mean:np.float64(0.18433333333333335) - val-aux/aime2025/score/best@4/std:np.float64(0.1396767239143393) - val-aux/aime2025/score/worst@4/mean:np.float64(0.0036666666666666666) - val-aux/aime2025/score/worst@4/std:np.float64(0.022575571197701972) - val-aux/aime2025/score/maj@4/mean:np.float64(0.08776666666666667) - val-aux/aime2025/score/maj@4/std:np.float64(0.12755164756812437) - val-aux/aime2025/score/best@8/mean:np.float64(0.24509999999999998) - val-aux/aime2025/score/best@8/std:np.float64(0.11406887430941023) - val-aux/aime2025/score/worst@8/mean:np.float64(0.0002) - val-aux/aime2025/score/worst@8/std:np.float64(0.0034046880385629214) - val-aux/aime2025/score/maj@8/mean:np.float64(0.10450000000000001) - val-aux/aime2025/score/maj@8/std:np.float64(0.12428814348177199) - val-aux/aime2025/score/best@16/mean:np.float64(0.29259999999999997) - val-aux/aime2025/score/best@16/std:np.float64(0.07468123924729826) - val-aux/aime2025/score/worst@16/mean:np.float64(0.0) - val-aux/aime2025/score/worst@16/std:np.float64(0.0) - val-aux/aime2025/score/maj@16/mean:np.float64(0.12040000000000001) - val-aux/aime2025/score/maj@16/std:np.float64(0.11357245803099218) - val-core/aime2025/acc/mean@16:np.float64(0.07291666666666667) - val-aux/aime2025/acc/std@16:np.float64(0.12495585900865498) - val-aux/aime2025/acc/best@2/mean:np.float64(0.1218) - 
val-aux/aime2025/acc/best@2/std:np.float64(0.143002686063943) - val-aux/aime2025/acc/worst@2/mean:np.float64(0.021866666666666666) - val-aux/aime2025/acc/worst@2/std:np.float64(0.06811524398544852) - val-aux/aime2025/acc/maj@2/mean:np.float64(0.0725) - val-aux/aime2025/acc/maj@2/std:np.float64(0.12547860656816717) - val-aux/aime2025/acc/best@4/mean:np.float64(0.18433333333333335) - val-aux/aime2025/acc/best@4/std:np.float64(0.1396767239143393) - val-aux/aime2025/acc/worst@4/mean:np.float64(0.0036666666666666666) - val-aux/aime2025/acc/worst@4/std:np.float64(0.022575571197701972) - val-aux/aime2025/acc/maj@4/mean:np.float64(0.08776666666666667) - val-aux/aime2025/acc/maj@4/std:np.float64(0.12755164756812437) - val-aux/aime2025/acc/best@8/mean:np.float64(0.24509999999999998) - val-aux/aime2025/acc/best@8/std:np.float64(0.11406887430941023) - val-aux/aime2025/acc/worst@8/mean:np.float64(0.0002) - val-aux/aime2025/acc/worst@8/std:np.float64(0.0034046880385629214) - val-aux/aime2025/acc/maj@8/mean:np.float64(0.10450000000000001) - val-aux/aime2025/acc/maj@8/std:np.float64(0.12428814348177199) - val-core/aime2025/acc/best@16/mean:np.float64(0.29259999999999997) - val-core/aime2025/acc/best@16/std:np.float64(0.07468123924729826) - val-aux/aime2025/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2025/acc/worst@16/std:np.float64(0.0) - val-core/aime2025/acc/maj@16/mean:np.float64(0.12040000000000001) - val-core/aime2025/acc/maj@16/std:np.float64(0.11357245803099218) - val-aux/math500/reward/mean@4:np.float64(0.695) - val-aux/math500/reward/std@4:np.float64(0.13926279441628822) - val-aux/math500/reward/best@2/mean:np.float64(0.7576759999999999) - val-aux/math500/reward/best@2/std:np.float64(0.11119267732783826) - val-aux/math500/reward/worst@2/mean:np.float64(0.632036) - val-aux/math500/reward/worst@2/std:np.float64(0.1281901221846582) - val-aux/math500/reward/maj@2/mean:np.float64(0.6946939999999999) - val-aux/math500/reward/maj@2/std:np.float64(0.13941245614602693) - 
val-aux/math500/reward/best@4/mean:np.float64(0.8010940000000001) - val-aux/math500/reward/best@4/std:np.float64(0.06557523030357368) - val-aux/math500/reward/worst@4/mean:np.float64(0.5766359999999999) - val-aux/math500/reward/worst@4/std:np.float64(0.09364536081911788) - val-aux/math500/reward/maj@4/mean:np.float64(0.7108) - val-aux/math500/reward/maj@4/std:np.float64(0.12717362071235183) - val-aux/math500/score/mean@4:np.float64(0.695) - val-aux/math500/score/std@4:np.float64(0.13926279441628822) - val-aux/math500/score/best@2/mean:np.float64(0.7576759999999999) - val-aux/math500/score/best@2/std:np.float64(0.11119267732783826) - val-aux/math500/score/worst@2/mean:np.float64(0.632036) - val-aux/math500/score/worst@2/std:np.float64(0.1281901221846582) - val-aux/math500/score/maj@2/mean:np.float64(0.6946939999999999) - val-aux/math500/score/maj@2/std:np.float64(0.13941245614602693) - val-aux/math500/score/best@4/mean:np.float64(0.8010940000000001) - val-aux/math500/score/best@4/std:np.float64(0.06557523030357368) - val-aux/math500/score/worst@4/mean:np.float64(0.5766359999999999) - val-aux/math500/score/worst@4/std:np.float64(0.09364536081911788) - val-aux/math500/score/maj@4/mean:np.float64(0.7108) - val-aux/math500/score/maj@4/std:np.float64(0.12717362071235183) - val-core/math500/acc/mean@4:np.float64(0.695) - val-aux/math500/acc/std@4:np.float64(0.13926279441628822) - val-aux/math500/acc/best@2/mean:np.float64(0.7576759999999999) - val-aux/math500/acc/best@2/std:np.float64(0.11119267732783826) - val-aux/math500/acc/worst@2/mean:np.float64(0.632036) - val-aux/math500/acc/worst@2/std:np.float64(0.1281901221846582) - val-aux/math500/acc/maj@2/mean:np.float64(0.6946939999999999) - val-aux/math500/acc/maj@2/std:np.float64(0.13941245614602693) - val-core/math500/acc/best@4/mean:np.float64(0.8010940000000001) - val-core/math500/acc/best@4/std:np.float64(0.06557523030357368) - val-aux/math500/acc/worst@4/mean:np.float64(0.5766359999999999) - 
val-aux/math500/acc/worst@4/std:np.float64(0.09364536081911788) - val-core/math500/acc/maj@4/mean:np.float64(0.7108) - val-core/math500/acc/maj@4/std:np.float64(0.12717362071235183) - val-aux/num_turns/min:np.int32(2) - val-aux/num_turns/max:np.int32(2) - val-aux/num_turns/mean:np.float64(2.0) - val-aux/response_length/clip_ratio:0.06283783783783783 - val-aux/aime2024/response_length/clip_ratio:0.13333333333333333 - val-aux/aime2025/response_length/clip_ratio:0.08958333333333333 - val-aux/math500/response_length/clip_ratio:0.0395 - val-best/metric:0.29259999999999997 - val-best/step:275.0 - training/global_step:275 - training/epoch:0 - critic/score/mean:0.6322815418243408 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6480870842933655 - critic/rewards/max:1.594557762145996 - critic/rewards/min:-0.07261671870946884 - critic/advantages/mean:-0.037619948387145996 - critic/advantages/max:2.4747860431671143 - critic/advantages/min:-2.474822759628296 - critic/returns/mean:-0.037619948387145996 - critic/returns/max:2.4747860431671143 - critic/returns/min:-2.474822759628296 - response_length/mean:1404.7366943359375 - response_length/max:8192.0 - response_length/min:109.0 - response_length/clip_ratio:0.03519417345523834 - response_length_non_aborted/mean:1404.7366943359375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:109.0 - response_length_non_aborted/clip_ratio:0.03519417345523834 - response/aborted_ratio:0.0 - prompt_length/mean:249.77670288085938 - prompt_length/max:879.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.870901703834534e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9695255262777209) - 
timing_s/agent_loop/generate_sequences/max:np.float64(33.99934625253081) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.289629581023291) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(33.99934625253081) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:211 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:36.867887454107404 - timing_s/reward:0.00017372332513332367 - timing_s/old_log_prob:12.004906486719847 - timing_s/ref:15.038872112520039 - timing_s/adv:0.12130844034254551 - timing_s/update_actor:26.01027380861342 - timing_s/save_checkpoint:57.412205756641924 - timing_s/update_weights:25.705706354230642 - timing_s/step:173.67854334693402 - timing_s/testing:131.7090217322111 - timing_s/stop_profile:0.000125993974506855 - timing_per_token_ms/adv:8.898023158376397e-05 - timing_per_token_ms/update_actor:0.019078641028705254 - timing_per_token_ms/gen:0.031851224104047594 - timing_per_token_ms/ref:0.011031073514357271 - perf/total_num_tokens:1855241 - perf/time_per_step:173.67854334693402 - perf/throughput:2670.50978815218 - frontier/active_count:40.0 - frontier/completed_count:24.0 - frontier/blacklisted_count:1597.0 - frontier/mean_score:2.6290823540599346 - frontier/mean_frontier_pct:0.6253871731815137 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:112.0 - 
frontier/cluster_2/score:2.2521680314009993 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:144.0 - frontier/cluster_3/score:3.031302522965459 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:1.9089443929999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.7807441683012997 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:160.0 - frontier/cluster_8/score:2.266175546860089 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.9303241909509097 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:2.4715112927989997 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:3.0082249800543015 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:3.0038533378850296 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:2.7712070138992995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.5732177692715092 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:96.0 - frontier/cluster_17/score:2.4716378950999998 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.050816132569999 - frontier/cluster_18/cluster_size:216.0 - 
frontier/cluster_19/frontier:112.0 - frontier/cluster_19/score:3.8566161325699992 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:160.0 - frontier/cluster_21/score:2.6136393606391826 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.782571934622439 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:128.0 - frontier/cluster_25/score:1.5495389248999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:2.4875089999999993 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:2.7832977425989993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.3773072277389993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:192.0 - frontier/cluster_32/score:1.9205381700233901 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:3.3172598369975654 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:3.581663782791951 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:1.4519524751 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.805213586402089 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:144.0 - frontier/cluster_41/score:3.606676358722689 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:192.0 - frontier/cluster_42/score:1.40653676645998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:3.072594475099999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:2.899552966180699 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:112.0 - frontier/cluster_45/score:2.903705352340999 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:176.0 - frontier/cluster_46/score:2.667269352066293 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.337907512943039 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.4879311840192997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:176.0 - frontier/cluster_49/score:4.204809575763882 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:80.0 - frontier/cluster_51/score:2.8466983108999995 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - 
frontier/cluster_52/score:1.9396463929999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:2.8168036999999995 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.7044338129999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:3.095489304959299 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:2.1260036474986994 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:275.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.021415913692501027 - cluster/prob_snapshot/cluster_3:0.02882472013746926 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.018152192817886926 - cluster/prob_snapshot/cluster_6:0.016933134155643726 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.021549111454806365 - cluster/prob_snapshot/cluster_9:0.018355493771144385 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.02350165342845188 - 
cluster/prob_snapshot/cluster_12:0.028605275291290134 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.02856370525295983 - cluster/prob_snapshot/cluster_15:0.02635146641203432 - cluster/prob_snapshot/cluster_16:0.02446878247554554 - cluster/prob_snapshot/cluster_17:0.0235028572924237 - cluster/prob_snapshot/cluster_18:0.029010275466065247 - cluster/prob_snapshot/cluster_19:0.03667264479766541 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.02485315224723843 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02645953568481299 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.014734598580633463 - cluster/prob_snapshot/cluster_26:0.02365377596634324 - cluster/prob_snapshot/cluster_27:0.026466437408293044 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.022605864970983755 - cluster/prob_snapshot/cluster_32:0.01826243829009025 - cluster/prob_snapshot/cluster_33:0.031543894316156736 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0340581170580393 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.01380664695476196 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.026674835632955408 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.034295962174341124 - cluster/prob_snapshot/cluster_42:0.013374788015749577 - cluster/prob_snapshot/cluster_43:0.029217366188198472 - cluster/prob_snapshot/cluster_44:0.02757191080095963 - cluster/prob_snapshot/cluster_45:0.027611395929239158 - cluster/prob_snapshot/cluster_46:0.02536312097591189 - cluster/prob_snapshot/cluster_47:0.03174023350569932 - cluster/prob_snapshot/cluster_48:0.014148769262795978 - cluster/prob_snapshot/cluster_49:0.039983623651714886 - cluster/prob_snapshot/cluster_50:0.0 - 
cluster/prob_snapshot/cluster_51:0.027069314760186324 - cluster/prob_snapshot/cluster_52:0.018444138788622576 - cluster/prob_snapshot/cluster_53:0.026785046269567955 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.025716518625060417 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.02943507361208294 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.020216213883673513
[36m(TaskRunner pid=2823680)[0m Training Progress:  34%|███▍      | 276/800 [9:10:15<23:58:27, 164.71s/it]
[36m(TaskRunner pid=2823680)[0m step:276 - global_seqlen/min:411162 - global_seqlen/max:493604 - global_seqlen/minmax_diff:82442 - global_seqlen/balanced_min:456315 - global_seqlen/balanced_max:456491 - global_seqlen/mean:456403.75 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.16961146056976006) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009964700788259506 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.10360020757070743) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0013441870979358614) - actor/ppo_kl:np.float64(-0.0017385873018452992) - actor/pg_clipfrac_lower:np.float64(0.00017135443241663185) - actor/grad_norm:np.float64(0.48834821457664174) - perf/mfu/actor:np.float64(0.2612244435444864) - perf/max_memory_allocated_gb:np.float64(91.36168622970581) - perf/max_memory_reserved_gb:np.float64(97.48828125) - perf/cpu_memory_used_gb:np.float64(114.38310623168945) - actor/lr:np.float64(1e-06) - training/global_step:276 - training/epoch:0 - critic/score/mean:0.6552197933197021 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6700964570045471 - critic/rewards/max:1.6020699739456177 - critic/rewards/min:-0.10707191377878189 - critic/advantages/mean:-0.04884430021047592 - critic/advantages/max:2.474496364593506 - critic/advantages/min:-2.4746761322021484 - critic/returns/mean:-0.04884430021047592 - critic/returns/max:2.474496364593506 - critic/returns/min:-2.4746761322021484 - response_length/mean:1339.97802734375 - response_length/max:8192.0 - response_length/min:174.0 - response_length/clip_ratio:0.021978022530674934 - response_length_non_aborted/mean:1339.97802734375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:174.0 - response_length_non_aborted/clip_ratio:0.021978022530674934 - response/aborted_ratio:0.0 - prompt_length/mean:227.12088012695312 - prompt_length/max:386.0 - prompt_length/min:183.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.54693353176117e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.398542488925159) - timing_s/agent_loop/generate_sequences/max:np.float64(32.93310457840562) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.141462926235363) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(32.93310457840562) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:226 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:35.2128727408126 - timing_s/reward:0.00038052722811698914 - timing_s/old_log_prob:10.285609848797321 - timing_s/ref:23.046220020391047 - timing_s/adv:0.13700813427567482 - timing_s/update_actor:21.021685468032956 - timing_s/update_weights:29.91936223488301 - timing_s/step:120.1851950129494 - timing_s/stop_profile:5.952734500169754e-05 - timing_per_token_ms/adv:0.00012009324140961356 - timing_per_token_ms/update_actor:0.018426368340070682 - timing_per_token_ms/gen:0.036097107485784374 - timing_per_token_ms/ref:0.020200955798135287 - perf/total_num_tokens:1825615 - perf/time_per_step:120.1851950129494 - perf/throughput:3797.5039267592365 - frontier/active_count:39.0 - frontier/completed_count:25.0 - frontier/blacklisted_count:1634.0 - frontier/mean_score:2.563733799113255 - frontier/mean_frontier_pct:0.632431158352986 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:112.0 - frontier/cluster_2/score:2.2521680314009993 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:144.0 - frontier/cluster_3/score:3.0219117660758212 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:1.9089443929999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:144.0 - frontier/cluster_6/score:1.5465209178109098 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:176.0 - frontier/cluster_8/score:1.8863228828020622 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.9303241909509097 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:2.4715112927989997 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:3.0057574860380107 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:3.0038533378850296 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:2.7712070138992995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.5732177692715092 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:96.0 - frontier/cluster_17/score:2.4716378950999998 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.050816132569999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:160.0 - frontier/cluster_21/score:2.7295475524474275 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.782571934622439 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:128.0 - frontier/cluster_25/score:1.5495389248999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:2.6412562999999993 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:2.7832977425989993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.3773072277389993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:192.0 - frontier/cluster_32/score:1.9205381700233901 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - 
frontier/cluster_33/score:3.3172598369975654 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:3.581663782791951 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:1.4519524751 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.805213586402089 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:144.0 - frontier/cluster_41/score:3.606676358722689 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:208.0 - frontier/cluster_42/score:1.2845757365219859 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:3.072594475099999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:2.929687076326489 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:128.0 - frontier/cluster_45/score:2.332593746638699 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:176.0 - frontier/cluster_46/score:2.667269352066293 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:144.0 - frontier/cluster_47/score:3.236535259060127 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.4879311840192997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:176.0 - frontier/cluster_49/score:3.843366703034717 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:2.8926888176299994 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.9396463929999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.8717625899999994 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:2.7931036690999997 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:3.066842513471509 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:2.1260036474986994 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:276.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.022524919810717133 - cluster/prob_snapshot/cluster_3:0.03022346434940621 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.019092189737146274 - 
cluster/prob_snapshot/cluster_6:0.01546743367883503 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.018865942096605488 - cluster/prob_snapshot/cluster_9:0.019306018469149904 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.024718667925922238 - cluster/prob_snapshot/cluster_12:0.03006189831286799 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.030042854092429894 - cluster/prob_snapshot/cluster_15:0.027716056216350975 - cluster/prob_snapshot/cluster_16:0.0257358789842591 - cluster/prob_snapshot/cluster_17:0.02471993413103576 - cluster/prob_snapshot/cluster_18:0.03051254959010911 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.02729940167926595 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.027829721770774384 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.015497618090798129 - cluster/prob_snapshot/cluster_26:0.026416362157508345 - cluster/prob_snapshot/cluster_27:0.02783698089453516 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.023776491773106454 - cluster/prob_snapshot/cluster_32:0.01920814418375688 - cluster/prob_snapshot/cluster_33:0.033177369884431436 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.035821789658452875 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.014521613225392865 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.028056170856103752 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.03607195139560748 - cluster/prob_snapshot/cluster_42:0.012847605086531287 - cluster/prob_snapshot/cluster_43:0.030730364341166303 - cluster/prob_snapshot/cluster_44:0.029301084796811407 - cluster/prob_snapshot/cluster_45:0.023329292646664827 - 
cluster/prob_snapshot/cluster_46:0.026676530095093704 - cluster/prob_snapshot/cluster_47:0.032370007991605144 - cluster/prob_snapshot/cluster_48:0.014881452066012687 - cluster/prob_snapshot/cluster_49:0.0384391952918291 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.02893104899190915 - cluster/prob_snapshot/cluster_52:0.01939925389860604 - cluster/prob_snapshot/cluster_53:0.028721756615525784 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.027935054264294972 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.03067283645131545 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.021263094498064943
[36m(TaskRunner pid=2823680)[0m Training Progress:  35%|███▍      | 277/800 [9:12:25<22:26:03, 154.42s/it]
[36m(TaskRunner pid=2823680)[0m step:277 - global_seqlen/min:434239 - global_seqlen/max:471294 - global_seqlen/minmax_diff:37055 - global_seqlen/balanced_min:452452 - global_seqlen/balanced_max:452537 - global_seqlen/mean:452511.75 - frontier/skipped_zero_acc_count:35.0 - actor/entropy:np.float64(0.18274295345899907) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01187601126730442 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.05432305951035232) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.001815114001495184) - actor/ppo_kl:np.float64(0.0041839825938680825) - actor/pg_clipfrac_lower:np.float64(0.00011717278616839743) - actor/grad_norm:np.float64(0.3586750614146392) - perf/mfu/actor:np.float64(0.2288414953708941) - perf/max_memory_allocated_gb:np.float64(91.36168622970581) - perf/max_memory_reserved_gb:np.float64(97.48828125) - perf/cpu_memory_used_gb:np.float64(113.67302703857422) - actor/lr:np.float64(1e-06) - training/global_step:277 - training/epoch:0 - critic/score/mean:0.6263440847396851 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6402813196182251 - critic/rewards/max:1.5316128730773926 - critic/rewards/min:-0.08220599591732025 - critic/advantages/mean:-0.11577722430229187 - critic/advantages/max:2.4745113849639893 - critic/advantages/min:-2.4745125770568848 - critic/returns/mean:-0.11577722430229187 - critic/returns/max:2.4745113849639893 - critic/returns/min:-2.4745125770568848 - response_length/mean:1352.8387451171875 - response_length/max:8192.0 - response_length/min:120.0 - response_length/clip_ratio:0.03360215201973915 - response_length_non_aborted/mean:1352.8387451171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:120.0 - response_length_non_aborted/clip_ratio:0.03360215201973915 - response/aborted_ratio:0.0 - prompt_length/mean:235.64515686035156 - prompt_length/max:430.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00011241529136896133 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.992358447983861) - timing_s/agent_loop/generate_sequences/max:np.float64(33.184182341210544) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.02256542737905) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(33.184182341210544) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:195 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:35.75599100347608 - timing_s/reward:0.00014973897486925125 - timing_s/old_log_prob:12.132693208754063 - timing_s/ref:25.608764024451375 - timing_s/adv:0.06757071800529957 - timing_s/update_actor:23.68577031418681 - timing_s/update_weights:32.570188011974096 - timing_s/step:130.21388896275312 - timing_s/stop_profile:5.333125591278076e-05 - timing_per_token_ms/adv:5.717455442507867e-05 - timing_per_token_ms/update_actor:0.020041571318247273 - timing_per_token_ms/gen:0.03552465445367376 - timing_per_token_ms/ref:0.021668700817418528 - perf/total_num_tokens:1810047 - perf/time_per_step:130.21388896275312 - perf/throughput:3475.141965304778 - frontier/active_count:37.0 - frontier/completed_count:27.0 - frontier/blacklisted_count:1669.0 - frontier/mean_score:2.5236210557477206 - frontier/mean_frontier_pct:0.632538328191246 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:112.0 - frontier/cluster_2/score:2.2521680314009993 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:2.2362610751 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:144.0 - frontier/cluster_6/score:1.5465209178109098 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:176.0 - frontier/cluster_8/score:1.8863228828020622 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:1.6512269336656367 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:2.4715112927989997 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:192.0 - frontier/cluster_12/score:3.0040302402266073 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:3.0038533378850296 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:2.7712070138992995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:2.701252438490056 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:112.0 - frontier/cluster_17/score:2.63014652657 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.050816132569999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:160.0 - frontier/cluster_21/score:2.7295475524474275 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:144.0 - frontier/cluster_23/score:2.847800354235707 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:128.0 - frontier/cluster_25/score:1.5495389248999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:80.0 - frontier/cluster_26/score:2.7488794099999994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:2.7832977425989993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.3773072277389993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:192.0 - frontier/cluster_32/score:1.9205381700233901 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:3.4071646479543656 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:1.4519524751 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.805213586402089 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:144.0 - frontier/cluster_41/score:3.4246734511058823 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:208.0 - frontier/cluster_42/score:1.2845757365219859 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.450816132569999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:2.929687076326489 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:128.0 - frontier/cluster_45/score:2.532815622647089 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:176.0 - frontier/cluster_46/score:2.667269352066293 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:144.0 - frontier/cluster_47/score:3.165574681342089 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.4879311840192997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:176.0 - frontier/cluster_49/score:3.843366703034717 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:2.9248821723409995 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.9396463929999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.9102338129999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:2.7931036690999997 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:3.066842513471509 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:2.1260036474986994 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:277.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.024119867804814357 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.023949510319134927 - cluster/prob_snapshot/cluster_6:0.016562654107018404 - 
cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.020201804632702894 - cluster/prob_snapshot/cluster_9:0.01768401593507605 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.026468951174719733 - cluster/prob_snapshot/cluster_12:0.03217202769318127 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0321701331360107 - cluster/prob_snapshot/cluster_15:0.029678578997254384 - cluster/prob_snapshot/cluster_16:0.028929391952732107 - cluster/prob_snapshot/cluster_17:0.02816787452963573 - cluster/prob_snapshot/cluster_18:0.032673086904891555 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.02923242192148156 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.030498864703243242 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.01659497582147661 - cluster/prob_snapshot/cluster_26:0.029439458804204503 - cluster/prob_snapshot/cluster_27:0.029808066128691562 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.025460061267642112 - cluster/prob_snapshot/cluster_32:0.020568237417990597 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.03648944472707681 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.015549861853113892 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.030042776526846292 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.036676957386677256 - cluster/prob_snapshot/cluster_42:0.013757320287913129 - cluster/prob_snapshot/cluster_43:0.026247313835958453 - cluster/prob_snapshot/cluster_44:0.031375840525766835 - cluster/prob_snapshot/cluster_45:0.02712549736096447 - cluster/prob_snapshot/cluster_46:0.028565445950163703 - 
cluster/prob_snapshot/cluster_47:0.03390210755844079 - cluster/prob_snapshot/cluster_48:0.01593518021782826 - cluster/prob_snapshot/cluster_49:0.04116100375732447 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.031324381821385555 - cluster/prob_snapshot/cluster_52:0.020772879258987704 - cluster/prob_snapshot/cluster_53:0.031167503433123137 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.029913083892735007 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.032844723382873864 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0227686949709178
[36m(TaskRunner pid=2823680)[0m Training Progress:  35%|███▍      | 278/800 [9:14:28<21:01:37, 145.02s/it]
[36m(TaskRunner pid=2823680)[0m step:278 - global_seqlen/min:353320 - global_seqlen/max:446170 - global_seqlen/minmax_diff:92850 - global_seqlen/balanced_min:389721 - global_seqlen/balanced_max:390007 - global_seqlen/mean:389850.0 - frontier/skipped_zero_acc_count:23.0 - actor/entropy:np.float64(0.1917432805628709) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01896487921476364 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.024854164017597213) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0010622161718410491) - actor/ppo_kl:np.float64(2.821386431848177e-05) - actor/pg_clipfrac_lower:np.float64(2.262116166368184e-05) - actor/grad_norm:np.float64(0.6022202404482024) - perf/mfu/actor:np.float64(0.138819384296181) - perf/max_memory_allocated_gb:np.float64(91.36168622970581) - perf/max_memory_reserved_gb:np.float64(97.48828125) - perf/cpu_memory_used_gb:np.float64(113.1014175415039) - actor/lr:np.float64(1e-06) - training/global_step:278 - training/epoch:0 - critic/score/mean:0.6178571581840515 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6170890927314758 - critic/rewards/max:1.2409518957138062 - critic/rewards/min:-0.11931155622005463 - critic/advantages/mean:-0.11225973069667816 - critic/advantages/max:2.473825693130493 - critic/advantages/min:-2.4739911556243896 - critic/returns/mean:-0.11225973069667816 - critic/returns/max:2.473825693130493 - critic/returns/min:-2.4739911556243896 - response_length/mean:1214.8702392578125 - response_length/max:8192.0 - response_length/min:97.0 - response_length/clip_ratio:0.02619047649204731 - response_length_non_aborted/mean:1214.8702392578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:97.0 - response_length_non_aborted/clip_ratio:0.02619047649204731 - response/aborted_ratio:0.0 - prompt_length/mean:231.39999389648438 - prompt_length/max:568.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.785965085029602e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.8960887854918838) - timing_s/agent_loop/generate_sequences/max:np.float64(30.110818617977202) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.839087650463625) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.110818617977202) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:212 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.33387084584683 - timing_s/reward:0.00014717783778905869 - timing_s/old_log_prob:14.963569018058479 - timing_s/ref:15.876238295808434 - timing_s/adv:0.08237164933234453 - timing_s/update_actor:33.65466639492661 - timing_s/update_weights:25.505403670482337 - timing_s/step:122.83622142765671 - timing_s/stop_profile:5.238596349954605e-05 - timing_per_token_ms/adv:6.780301821709251e-05 - timing_per_token_ms/update_actor:0.027702346343201855 - timing_per_token_ms/gen:0.031684621271375085 - timing_per_token_ms/ref:0.013068293315900781 - perf/total_num_tokens:1559400 - perf/time_per_step:122.83622142765671 - perf/throughput:3173.7381325230576 - frontier/active_count:35.0 - frontier/completed_count:29.0 - frontier/blacklisted_count:1692.0 - frontier/mean_score:2.5385375510012897 - frontier/mean_frontier_pct:0.6402282828301195 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:112.0 - frontier/cluster_2/score:2.2521680314009993 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:2.2362610751 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:160.0 - frontier/cluster_6/score:1.3825646424676368 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:192.0 - frontier/cluster_8/score:1.6204260179614436 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:1.6512269336656367 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:2.4715112927989997 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:192.0 - frontier/cluster_12/score:3.0040302402266073 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:3.0038533378850296 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:2.7712070138992995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:2.701252438490056 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.141102568599 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.0355712927989993 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:160.0 - frontier/cluster_21/score:2.7295475524474275 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:144.0 - frontier/cluster_23/score:2.847800354235707 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:128.0 - frontier/cluster_25/score:1.5495389248999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:80.0 - frontier/cluster_26/score:2.7488794099999994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:2.7832977425989993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:2.5641150594172992 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:192.0 - frontier/cluster_32/score:1.9205381700233901 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:3.4071646479543656 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:1.9163667325699998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:176.0 - frontier/cluster_39/score:3.4636495104814617 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:160.0 - frontier/cluster_41/score:3.297271415774117 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:208.0 - frontier/cluster_42/score:1.2845757365219859 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.450816132569999 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:144.0 - frontier/cluster_44/score:2.9507809534285423 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:128.0 - frontier/cluster_45/score:2.532815622647089 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.4879311840192997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:192.0 - frontier/cluster_49/score:3.5903566921243018 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - 
frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.9474175206386994 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.9396463929999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:3.5371636690999995 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:2.8551725683699996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:128.0 - frontier/cluster_59/score:3.6467897594300562 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:2.1260036474986994 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:278.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.025348318371200593 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.025169284397264072 - cluster/prob_snapshot/cluster_6:0.015560867678347256 - cluster/prob_snapshot/cluster_7:0.0 - 
cluster/prob_snapshot/cluster_8:0.01823801511591124 - cluster/prob_snapshot/cluster_9:0.018584681708505007 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.027817043060022015 - cluster/prob_snapshot/cluster_12:0.03381058334204737 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.033808592293061504 - cluster/prob_snapshot/cluster_15:0.03119014064733269 - cluster/prob_snapshot/cluster_16:0.03040279670839373 - cluster/prob_snapshot/cluster_17:0.024098268343006872 - cluster/prob_snapshot/cluster_18:0.03416558023003286 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.03072126031632208 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.03205220437832048 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.017440175621576248 - cluster/prob_snapshot/cluster_26:0.030938841808074477 - cluster/prob_snapshot/cluster_27:0.031326222696339105 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.02885930532726182 - cluster/prob_snapshot/cluster_32:0.021615799664606795 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.03834789102557391 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.02156884982642429 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.03898363234617141 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.03711103454004234 - cluster/prob_snapshot/cluster_42:0.014457995268240775 - cluster/prob_snapshot/cluster_43:0.027584117495448847 - cluster/prob_snapshot/cluster_44:0.03321125866645597 - cluster/prob_snapshot/cluster_45:0.028507027843064952 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - 
cluster/prob_snapshot/cluster_48:0.016746775924839162 - cluster/prob_snapshot/cluster_49:0.04040973108099906 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.03317340298073965 - cluster/prob_snapshot/cluster_52:0.021830864132606407 - cluster/prob_snapshot/cluster_53:0.039811039658357805 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.03213517919563927 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.04104488944253561 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.02392832886523444
[36m(TaskRunner pid=2823680)[0m Training Progress:  35%|███▍      | 279/800 [9:16:24<19:42:44, 136.21s/it]
[36m(TaskRunner pid=2823680)[0m step:279 - global_seqlen/min:362299 - global_seqlen/max:561461 - global_seqlen/minmax_diff:199162 - global_seqlen/balanced_min:444488 - global_seqlen/balanced_max:444593 - global_seqlen/mean:444522.25 - frontier/skipped_zero_acc_count:24.0 - actor/entropy:np.float64(0.17785664577968419) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.018102165311574936 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.010821305852005025) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.002883440938813482) - actor/ppo_kl:np.float64(0.006546507019020982) - actor/pg_clipfrac_lower:np.float64(0.0002592888999445747) - actor/grad_norm:np.float64(0.7669506737819085) - perf/mfu/actor:np.float64(0.18578162017197639) - perf/max_memory_allocated_gb:np.float64(91.36168622970581) - perf/max_memory_reserved_gb:np.float64(97.48828125) - perf/cpu_memory_used_gb:np.float64(113.10855960845947) - actor/lr:np.float64(1e-06) - training/global_step:279 - training/epoch:0 - critic/score/mean:0.6009615659713745 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6082864999771118 - critic/rewards/max:1.3301957845687866 - critic/rewards/min:-0.20790664851665497 - critic/advantages/mean:-0.02787037566304207 - critic/advantages/max:2.4707233905792236 - critic/advantages/min:-2.4742274284362793 - critic/returns/mean:-0.02787037566304207 - critic/returns/max:2.4707233905792236 - critic/returns/min:-2.4742274284362793 - response_length/mean:1361.5625 - response_length/max:8192.0 - response_length/min:167.0 - response_length/clip_ratio:0.028846153989434242 - response_length_non_aborted/mean:1361.5625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:167.0 - response_length_non_aborted/clip_ratio:0.028846153989434242 - response/aborted_ratio:0.0 - prompt_length/mean:231.32691955566406 - prompt_length/max:657.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00013920292258262634 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0403094999492168) - timing_s/agent_loop/generate_sequences/max:np.float64(32.89292038511485) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.7775396192564585) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(32.89292038511485) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:191 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:34.23645600862801 - timing_s/reward:0.0001506740227341652 - timing_s/old_log_prob:11.781317645683885 - timing_s/ref:15.08554974757135 - timing_s/adv:0.07634955924004316 - timing_s/update_actor:28.45251856558025 - timing_s/update_weights:25.393817076459527 - timing_s/step:115.44368568900973 - timing_s/stop_profile:7.247831672430038e-05 - timing_per_token_ms/adv:5.7609960763159565e-05 - timing_per_token_ms/update_actor:0.021468997260647715 - timing_per_token_ms/gen:0.030222326590833504 - timing_per_token_ms/ref:0.011382880761837726 - perf/total_num_tokens:1778089 - perf/time_per_step:115.44368568900973 - perf/throughput:3850.554903431316 - frontier/active_count:35.0 - frontier/completed_count:29.0 - frontier/blacklisted_count:1716.0 - frontier/mean_score:2.520499550363606 - frontier/mean_frontier_pct:0.6597143991932418 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:112.0 - frontier/cluster_2/score:2.2521680314009993 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:2.2362610751 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:160.0 - frontier/cluster_6/score:1.3825646424676368 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:192.0 - frontier/cluster_8/score:1.6204260179614436 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:1.6512269336656367 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.6300579049592994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:192.0 - frontier/cluster_12/score:3.0028211681586248 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:192.0 - frontier/cluster_14/score:2.4026973365195206 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:2.7712070138992995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:2.701252438490056 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.141102568599 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.0355712927989993 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:160.0 - frontier/cluster_21/score:2.7295475524474275 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:160.0 - frontier/cluster_23/score:2.293460247964995 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:128.0 - frontier/cluster_25/score:1.5495389248999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:80.0 - frontier/cluster_26/score:2.8242155869999994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:2.7832977425989993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:2.5641150594172992 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:208.0 - frontier/cluster_32/score:1.644376719016373 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:3.2850152535680555 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:128.0 - frontier/cluster_37/score:2.2414567127989997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:176.0 - frontier/cluster_39/score:3.324554657337023 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:160.0 - frontier/cluster_41/score:3.2080899910418816 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:208.0 - frontier/cluster_42/score:1.2845757365219859 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.6155712927989994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:144.0 - frontier/cluster_44/score:2.9507809534285423 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:144.0 - frontier/cluster_45/score:2.072970935852962 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.4879311840192997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:208.0 - frontier/cluster_49/score:4.013249684487011 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.9631922644470894 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.9396463929999996 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:3.5371636690999995 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:2.8986207978589995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:144.0 - frontier/cluster_59/score:4.052752831601039 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:2.1260036474986994 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:279.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.025529724070272417 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.025349408836462008 - cluster/prob_snapshot/cluster_6:0.015672229309443144 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.01836853580108392 - 
cluster/prob_snapshot/cluster_9:0.018717683319420114 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.029813340597273818 - cluster/prob_snapshot/cluster_12:0.03403884381032368 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.027236067278498807 - cluster/prob_snapshot/cluster_15:0.03141335345322463 - cluster/prob_snapshot/cluster_16:0.03062037487314058 - cluster/prob_snapshot/cluster_17:0.024270728036433726 - cluster/prob_snapshot/cluster_18:0.03441008682313551 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0309411175716413 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.025997797002855944 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.017564986553971745 - cluster/prob_snapshot/cluster_26:0.032014238567368576 - cluster/prob_snapshot/cluster_27:0.03155040937587701 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.02906583746801192 - cluster/prob_snapshot/cluster_32:0.01864003187983856 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.03723768911596639 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.025408304618204394 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.03768589282636933 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.03636569346602155 - cluster/prob_snapshot/cluster_42:0.014561464172978541 - cluster/prob_snapshot/cluster_43:0.029649125846860438 - cluster/prob_snapshot/cluster_44:0.03344893564002155 - cluster/prob_snapshot/cluster_45:0.023498413644162795 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0168666245297587 - 
cluster/prob_snapshot/cluster_49:0.04549267889497837 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.033589625562458966 - cluster/prob_snapshot/cluster_52:0.021987097106775492 - cluster/prob_snapshot/cluster_53:0.04009594808400729 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.032857667865776236 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0459404716136687 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.02409957238371376
(RewardLoopWorker pid=2826754) WARNING:2026-04-12 20:48:49,644:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  35%|███▌      | 280/800 [9:18:22<18:53:28, 130.79s/it]
[36m(TaskRunner pid=2823680)[0m step:280 - global_seqlen/min:366376 - global_seqlen/max:507376 - global_seqlen/minmax_diff:141000 - global_seqlen/balanced_min:443639 - global_seqlen/balanced_max:443683 - global_seqlen/mean:443657.5 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.16250162245705724) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012125201523303986 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.013209164899308234) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006302832141796978) - actor/ppo_kl:np.float64(7.168507459670329e-05) - actor/pg_clipfrac_lower:np.float64(1.2014715732378965e-05) - actor/grad_norm:np.float64(0.4641767541567485) - perf/mfu/actor:np.float64(0.2520514378850116) - perf/max_memory_allocated_gb:np.float64(91.36168622970581) - perf/max_memory_reserved_gb:np.float64(97.48828125) - perf/cpu_memory_used_gb:np.float64(113.58818817138672) - actor/lr:np.float64(1e-06) - training/global_step:280 - training/epoch:0 - critic/score/mean:0.6208791136741638 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6307137608528137 - critic/rewards/max:1.3192275762557983 - critic/rewards/min:-0.13535484671592712 - critic/advantages/mean:-0.03618789464235306 - critic/advantages/max:2.4747064113616943 - critic/advantages/min:-2.4747848510742188 - critic/returns/mean:-0.03618789464235306 - critic/returns/max:2.4747064113616943 - critic/returns/min:-2.4747848510742188 - response_length/mean:1296.061767578125 - response_length/max:8192.0 - response_length/min:91.0 - response_length/clip_ratio:0.031593408435583115 - response_length_non_aborted/mean:1296.061767578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:91.0 - response_length_non_aborted/clip_ratio:0.031593408435583115 - response/aborted_ratio:0.0 - prompt_length/mean:237.05494689941406 - prompt_length/max:962.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.647756814956665e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1390087874606252) - timing_s/agent_loop/generate_sequences/max:np.float64(32.662943934090436) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.21381879384353) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(32.662943934090436) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:191 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:34.131651093252 - timing_s/reward:0.00019510649144649506 - timing_s/old_log_prob:10.031570657156408 - timing_s/ref:22.999345937743783 - timing_s/adv:0.07387344352900982 - timing_s/update_actor:21.026687543839216 - timing_s/update_weights:29.24086124636233 - timing_s/step:117.89447095338255 - timing_s/stop_profile:6.924662739038467e-05 - timing_per_token_ms/adv:6.618837723646151e-05 - timing_per_token_ms/update_actor:0.01883927783383094 - timing_per_token_ms/gen:0.036174305608020076 - timing_per_token_ms/ref:0.020606720255587744 - perf/total_num_tokens:1774630 - perf/time_per_step:117.89447095338255 - perf/throughput:3763.1747817540113 - frontier/active_count:33.0 - frontier/completed_count:31.0 - frontier/blacklisted_count:1753.0 - frontier/mean_score:2.576616916646796 - frontier/mean_frontier_pct:0.6535884718671533 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:112.0 - frontier/cluster_2/score:2.2521680314009993 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.8653827525699997 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:160.0 - frontier/cluster_6/score:1.3825646424676368 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:192.0 - frontier/cluster_8/score:2.0342982125730105 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:1.6512269336656367 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.7410405334715096 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:208.0 - frontier/cluster_12/score:3.0019748177110372 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:192.0 - frontier/cluster_14/score:2.4026973365195206 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:2.7712070138992995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:2.790876706943039 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.141102568599 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.0355712927989993 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:160.0 - frontier/cluster_21/score:2.7295475524474275 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:160.0 - frontier/cluster_23/score:2.5054221735754965 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:128.0 - frontier/cluster_25/score:1.5495389248999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:96.0 - frontier/cluster_26/score:2.8769509108999993 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:2.7832977425989993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:208.0 - frontier/cluster_32/score:2.0510637033114607 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:3.2850152535680555 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:128.0 - frontier/cluster_37/score:2.2414567127989997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:176.0 - frontier/cluster_39/score:3.324554657337023 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:160.0 - frontier/cluster_41/score:3.2080899910418816 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:208.0 - frontier/cluster_42/score:1.2845757365219859 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:2.7308999049592995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:144.0 - frontier/cluster_44/score:2.9507809534285423 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:144.0 - frontier/cluster_45/score:2.351079655097073 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:1.9415518288135096 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:208.0 - frontier/cluster_49/score:4.013249684487011 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - 
frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.9631922644470894 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:3.3760145683699996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:2.9290345585012996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:144.0 - frontier/cluster_59/score:3.736926982120727 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:2.1260036474986994 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:280.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.026487257636993927 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.021938360224477055 - cluster/prob_snapshot/cluster_6:0.016260041601806406 - cluster/prob_snapshot/cluster_7:0.0 - 
cluster/prob_snapshot/cluster_8:0.023924938155426505 - cluster/prob_snapshot/cluster_9:0.019419720287006375 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.03223678064479915 - cluster/prob_snapshot/cluster_12:0.03530557192351986 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.028257600005325872 - cluster/prob_snapshot/cluster_15:0.03259156205007252 - cluster/prob_snapshot/cluster_16:0.032822893025393235 - cluster/prob_snapshot/cluster_17:0.025181040921903392 - cluster/prob_snapshot/cluster_18:0.03570069274885017 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.03210161419844276 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.029465724437820934 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.018223789766186032 - cluster/prob_snapshot/cluster_26:0.03383519298894832 - cluster/prob_snapshot/cluster_27:0.03273375847663699 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.024122113440044907 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.038634348836123614 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0263612841521174 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.03909936315114801 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.03772964757986048 - cluster/prob_snapshot/cluster_42:0.015107615423491873 - cluster/prob_snapshot/cluster_43:0.03211751892175761 - cluster/prob_snapshot/cluster_44:0.03470349203703811 - cluster/prob_snapshot/cluster_45:0.027650535697779446 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.02283416813858667 - 
cluster/prob_snapshot/cluster_49:0.0471989553499107 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.03484945876242344 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.03970457195551033 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.034447737423224774 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.043949184237595765 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.025003465799777393
(RewardLoopWorker pid=2826759) WARNING:2026-04-12 20:50:45,106:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  35%|███▌      | 281/800 [9:20:34<18:54:24, 131.15s/it]
[36m(TaskRunner pid=2823680)[0m step:281 - global_seqlen/min:450585 - global_seqlen/max:530952 - global_seqlen/minmax_diff:80367 - global_seqlen/balanced_min:486550 - global_seqlen/balanced_max:486652 - global_seqlen/mean:486586.5 - frontier/skipped_zero_acc_count:39.0 - actor/entropy:np.float64(0.16180678268687593) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010035009123384953 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.022896541115187574) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00077660012256173) - actor/ppo_kl:np.float64(-2.9037452673542754e-05) - actor/pg_clipfrac_lower:np.float64(2.4100305624112175e-05) - actor/grad_norm:np.float64(0.40208728859821957) - perf/mfu/actor:np.float64(0.2407059167917757) - perf/max_memory_allocated_gb:np.float64(91.36168622970581) - perf/max_memory_reserved_gb:np.float64(97.48828125) - perf/cpu_memory_used_gb:np.float64(114.04080200195312) - actor/lr:np.float64(1e-06) - training/global_step:281 - training/epoch:0 - critic/score/mean:0.5379213690757751 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5554866790771484 - critic/rewards/max:1.4085099697113037 - critic/rewards/min:-0.276589572429657 - critic/advantages/mean:-0.017086971551179886 - critic/advantages/max:2.4747393131256104 - critic/advantages/min:-2.4748375415802 - critic/returns/mean:-0.017086971551179886 - critic/returns/max:2.4747393131256104 - critic/returns/min:-2.4748375415802 - response_length/mean:1553.02392578125 - response_length/max:8192.0 - response_length/min:215.0 - response_length/clip_ratio:0.04353932663798332 - response_length_non_aborted/mean:1553.02392578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:215.0 - response_length_non_aborted/clip_ratio:0.04353932663798332 - response/aborted_ratio:0.0 - prompt_length/mean:236.01123046875 - prompt_length/max:657.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.94591212272644e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7629104116931558) - timing_s/agent_loop/generate_sequences/max:np.float64(34.648823741823435) - timing_s/agent_loop/generate_sequences/mean:np.float64(9.036260826670514) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(34.648823741823435) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:195 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:36.67524656839669 - timing_s/reward:0.00013693887740373611 - timing_s/old_log_prob:10.867832881398499 - timing_s/ref:25.988549016416073 - timing_s/adv:0.06399104557931423 - timing_s/update_actor:24.14711920544505 - timing_s/update_weights:33.63216115627438 - timing_s/step:131.7741642249748 - timing_s/stop_profile:5.679205060005188e-05 - timing_per_token_ms/adv:5.0236612682998124e-05 - timing_per_token_ms/update_actor:0.01895686285404697 - timing_per_token_ms/gen:0.033167666348991764 - timing_per_token_ms/ref:0.020402490056403256 - perf/total_num_tokens:1946346 - perf/time_per_step:131.7741642249748 - perf/throughput:3692.5789122764822 - frontier/active_count:31.0 - frontier/completed_count:33.0 - frontier/blacklisted_count:1792.0 - frontier/mean_score:2.567598313824168 - frontier/mean_frontier_pct:0.6704369295179446 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:112.0 - frontier/cluster_2/score:2.2521680314009993 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:2.2057679267989996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:160.0 - frontier/cluster_6/score:1.3825646424676368 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:208.0 - frontier/cluster_8/score:2.3240087488011074 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:2.0558588535659457 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.7410405334715096 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:224.0 - frontier/cluster_12/score:2.401382372397726 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:192.0 - frontier/cluster_14/score:2.5818881355636645 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:2.7712070138992995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:2.790876706943039 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.141102568599 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.0355712927989993 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:176.0 - frontier/cluster_21/score:2.810683286713199 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:176.0 - frontier/cluster_23/score:2.053795521502847 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:144.0 - frontier/cluster_25/score:1.3846772474299998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:96.0 - frontier/cluster_26/score:2.8769509108999993 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:2.8483084198192996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:208.0 - frontier/cluster_32/score:2.0510637033114607 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:128.0 - frontier/cluster_37/score:2.2414567127989997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:192.0 - frontier/cluster_39/score:3.227188260135916 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:3.145662993729317 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:2.7308999049592995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:160.0 - frontier/cluster_44/score:2.3655466673999794 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:144.0 - frontier/cluster_45/score:2.351079655097073 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:160.0 - frontier/cluster_48/score:2.2590862801694565 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:208.0 - frontier/cluster_49/score:4.013249684487011 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.9631922644470894 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:3.3760145683699996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:112.0 - frontier/cluster_57/score:2.3503241909509094 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:144.0 - frontier/cluster_59/score:3.736926982120727 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:2.1260036474986994 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:281.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.028295150868007598 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.02771220237495568 - cluster/prob_snapshot/cluster_6:0.01736987409374583 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.02919772292700155 - cluster/prob_snapshot/cluster_9:0.025828817216978 - 
cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.034437108754116375 - cluster/prob_snapshot/cluster_12:0.03016980774587222 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.032437594931426006 - cluster/prob_snapshot/cluster_15:0.0348161058373536 - cluster/prob_snapshot/cluster_16:0.03506322635608941 - cluster/prob_snapshot/cluster_17:0.02689977806171981 - cluster/prob_snapshot/cluster_18:0.038137450893000205 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.035312066653510416 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02580289450996761 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.01739641584165324 - cluster/prob_snapshot/cluster_26:0.03614462106236752 - cluster/prob_snapshot/cluster_27:0.03578477064487456 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.025768573266263087 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.02816057903694829 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.04054483387867175 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.039520589825667286 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.03430970679757989 - cluster/prob_snapshot/cluster_44:0.029719585264585204 - cluster/prob_snapshot/cluster_45:0.029537828712670463 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0283820684025417 - cluster/prob_snapshot/cluster_49:0.050420529778546205 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.03722811575532706 - 
cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.04241461570040928 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.02952833742618871 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.04694894486893985 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.026710082513021614
[36m(TaskRunner pid=2823680)[0m Training Progress:  35%|███▌      | 282/800 [9:22:48<18:58:57, 131.93s/it]
[36m(TaskRunner pid=2823680)[0m step:282 - global_seqlen/min:346941 - global_seqlen/max:569763 - global_seqlen/minmax_diff:222822 - global_seqlen/balanced_min:484526 - global_seqlen/balanced_max:484755 - global_seqlen/mean:484592.0 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.182423132511371) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011017953976988792 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.043686154938768595) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0007936082774323679) - actor/ppo_kl:np.float64(4.2193618677377714e-05) - actor/pg_clipfrac_lower:np.float64(7.79224336973981e-06) - actor/grad_norm:np.float64(0.5097823577622572) - perf/mfu/actor:np.float64(0.2562327773042513) - perf/max_memory_allocated_gb:np.float64(91.36168622970581) - perf/max_memory_reserved_gb:np.float64(97.48828125) - perf/cpu_memory_used_gb:np.float64(113.67049789428711) - actor/lr:np.float64(1e-06) - training/global_step:282 - training/epoch:0 - critic/score/mean:0.5784574747085571 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5984132289886475 - critic/rewards/max:1.5826256275177002 - critic/rewards/min:-0.09758910536766052 - critic/advantages/mean:-0.10945277661085129 - critic/advantages/max:2.4744796752929688 - critic/advantages/min:-2.474858522415161 - critic/returns/mean:-0.10945277661085129 - critic/returns/max:2.4744796752929688 - critic/returns/min:-2.474858522415161 - response_length/mean:1483.49072265625 - response_length/max:8192.0 - response_length/min:91.0 - response_length/clip_ratio:0.05718085169792175 - response_length_non_aborted/mean:1483.49072265625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:91.0 - response_length_non_aborted/clip_ratio:0.05718085169792175 - response/aborted_ratio:0.0 - prompt_length/mean:232.98936462402344 - prompt_length/max:380.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.329283446073532e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.7946249824017286) - timing_s/agent_loop/generate_sequences/max:np.float64(36.87869427911937) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.581368075376304) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(36.87869427911937) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:195 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:39.285349500365555 - timing_s/reward:0.00012899748980998993 - timing_s/old_log_prob:11.058526592329144 - timing_s/ref:26.4209230709821 - timing_s/adv:0.06410250999033451 - timing_s/update_actor:23.14010967500508 - timing_s/update_weights:33.20801263861358 - timing_s/step:133.5648501170799 - timing_s/stop_profile:5.061645060777664e-05 - timing_per_token_ms/adv:4.966133995949352e-05 - timing_per_token_ms/update_actor:0.01792704924415075 - timing_per_token_ms/gen:0.03521502126719663 - timing_per_token_ms/ref:0.020468752984391842 - perf/total_num_tokens:1938368 - perf/time_per_step:133.5648501170799 - perf/throughput:3628.1401849005756 - frontier/active_count:30.0 - frontier/completed_count:34.0 - frontier/blacklisted_count:1826.0 - frontier/mean_score:2.5613993869202174 - frontier/mean_frontier_pct:0.6821172678347971 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:112.0 - frontier/cluster_2/score:2.2521680314009993 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:2.2057679267989996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:160.0 - frontier/cluster_6/score:1.3825646424676368 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:208.0 - frontier/cluster_8/score:2.526806124160775 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:2.0558588535659457 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.7410405334715096 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:240.0 - frontier/cluster_12/score:1.9809676606784081 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:128.0 - frontier/cluster_15/score:2.8398449097295093 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:2.790876706943039 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.3987717980192995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.0355712927989993 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:176.0 - frontier/cluster_21/score:2.867478300699239 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:1.737656865051993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:144.0 - frontier/cluster_25/score:1.3846772474299998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:96.0 - frontier/cluster_26/score:2.9138656376299994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:2.8483084198192996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:208.0 - frontier/cluster_32/score:2.0510637033114607 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:1.8690196989592998 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:192.0 - frontier/cluster_39/score:3.227188260135916 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:3.1019640956105214 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:2.7308999049592995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:160.0 - frontier/cluster_44/score:2.5558826671799855 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:160.0 - frontier/cluster_45/score:2.545755758567951 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:160.0 - frontier/cluster_48/score:2.2590862801694565 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:208.0 - frontier/cluster_49/score:3.7092747791409075 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:128.0 - frontier/cluster_51/score:2.9742345851129626 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:3.2632101978589994 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:112.0 - frontier/cluster_57/score:2.3503241909509094 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:4.115848887484509 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:128.0 - frontier/cluster_63/score:2.1260036474986994 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:282.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.02930908318972944 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.028705245240326447 - cluster/prob_snapshot/cluster_6:0.017992308547269138 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.03288314631270062 - cluster/prob_snapshot/cluster_9:0.026754370599969508 - 
cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.03567113283814867 - cluster/prob_snapshot/cluster_12:0.025779757617316754 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.036956945283259005 - cluster/prob_snapshot/cluster_16:0.03631968682424989 - cluster/prob_snapshot/cluster_17:0.03121694349670242 - cluster/prob_snapshot/cluster_18:0.03950407354537185 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.03731655848416316 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.022613379154188595 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.018019801395815795 - cluster/prob_snapshot/cluster_26:0.03792023027867306 - cluster/prob_snapshot/cluster_27:0.03706708703016253 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.02669196785925192 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.024322898236844623 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.041997723023536125 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.04036809086276167 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0355391655424067 - cluster/prob_snapshot/cluster_44:0.033261540289676504 - cluster/prob_snapshot/cluster_45:0.033129751540867974 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.02939911533913164 - cluster/prob_snapshot/cluster_49:0.04827146179132019 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.038705854832074574 - 
cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.04246650241950523 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.030586459924378792 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.05356250322254943 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.027667215277647757
[36m(TaskRunner pid=2823680)[0m Training Progress:  35%|███▌      | 283/800 [9:25:08<19:18:02, 134.40s/it]
[36m(TaskRunner pid=2823680)[0m step:283 - global_seqlen/min:430017 - global_seqlen/max:568424 - global_seqlen/minmax_diff:138407 - global_seqlen/balanced_min:518605 - global_seqlen/balanced_max:518674 - global_seqlen/mean:518647.25 - frontier/skipped_zero_acc_count:40.0 - actor/entropy:np.float64(0.1757771824909882) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.006926394533365965 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.004759312862006482) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000902881327171847) - actor/ppo_kl:np.float64(0.0001217376008643033) - actor/pg_clipfrac_lower:np.float64(1.2975984149802985e-05) - actor/grad_norm:np.float64(0.5418330729007721) - perf/mfu/actor:np.float64(0.276431252820328) - perf/max_memory_allocated_gb:np.float64(93.24382066726685) - perf/max_memory_reserved_gb:np.float64(99.0078125) - perf/cpu_memory_used_gb:np.float64(113.57842254638672) - actor/lr:np.float64(1e-06) - training/global_step:283 - training/epoch:0 - critic/score/mean:0.6022727489471436 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6348970532417297 - critic/rewards/max:1.5977590084075928 - critic/rewards/min:-0.08738682419061661 - critic/advantages/mean:-0.09504754096269608 - critic/advantages/max:2.4742677211761475 - critic/advantages/min:-2.4747776985168457 - critic/returns/mean:-0.09504754096269608 - critic/returns/max:2.4742677211761475 - critic/returns/min:-2.4747776985168457 - response_length/mean:1701.8621826171875 - response_length/max:8192.0 - response_length/min:134.0 - response_length/clip_ratio:0.06676136702299118 - response_length_non_aborted/mean:1701.8621826171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:134.0 - response_length_non_aborted/clip_ratio:0.06676136702299118 - response/aborted_ratio:0.0 - prompt_length/mean:236.01136779785156 - prompt_length/max:500.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.917320519685745e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1994093274697661) - timing_s/agent_loop/generate_sequences/max:np.float64(37.76342280115932) - timing_s/agent_loop/generate_sequences/mean:np.float64(9.496774538535647) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(37.76342280115932) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:194 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:39.604349481873214 - timing_s/reward:0.00013478286564350128 - timing_s/old_log_prob:11.285721527412534 - timing_s/ref:30.120240012183785 - timing_s/adv:0.06873288098722696 - timing_s/update_actor:22.66843743994832 - timing_s/update_weights:35.39781566336751 - timing_s/step:139.51624836865813 - timing_s/stop_profile:5.22444024682045e-05 - timing_per_token_ms/adv:5.0380960993024776e-05 - timing_per_token_ms/update_actor:0.016615885236166576 - timing_per_token_ms/gen:0.03305565968584982 - timing_per_token_ms/ref:0.022078030418023347 - perf/total_num_tokens:2074589 - perf/time_per_step:139.51624836865813 - perf/throughput:3717.4684387263987 - frontier/active_count:28.0 - frontier/completed_count:36.0 - frontier/blacklisted_count:1865.0 - frontier/mean_score:2.4606800620837483 - frontier/mean_frontier_pct:0.6835477190232775 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:112.0 - frontier/cluster_2/score:2.4765176219806992 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:2.2057679267989996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:160.0 - frontier/cluster_6/score:1.3825646424676368 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:224.0 - frontier/cluster_8/score:2.668764286912542 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:2.339101197496162 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.7410405334715096 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:240.0 - frontier/cluster_12/score:1.9809676606784081 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:144.0 - frontier/cluster_15/score:2.287891436810656 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:2.790876706943039 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.3987717980192995 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:3.024899904959299 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:1.737656865051993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:144.0 - frontier/cluster_25/score:1.3846772474299998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:96.0 - frontier/cluster_26/score:2.9138656376299994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:2.2938158938735094 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:208.0 - frontier/cluster_32/score:2.0510637033114607 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:2.20831378927151 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:192.0 - frontier/cluster_39/score:3.159031782095141 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:3.1019640956105214 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:2.8116299334715094 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:160.0 - frontier/cluster_44/score:2.5558826671799855 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:176.0 - frontier/cluster_45/score:2.0820290309975653 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:176.0 - frontier/cluster_48/score:1.8813603961186194 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:128.0 - frontier/cluster_51/score:2.9819642095790737 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:3.1842471385012994 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:112.0 - frontier/cluster_57/score:2.3503241909509094 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:4.115848887484509 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:144.0 - frontier/cluster_63/score:1.7882025532490895 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:283.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.03594415189961086 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.03201449354224335 - cluster/prob_snapshot/cluster_6:0.020066529338944154 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.03873441806415245 - cluster/prob_snapshot/cluster_9:0.03394969129439086 - 
cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.03978343478100967 - cluster/prob_snapshot/cluster_12:0.02875174473690719 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0332064333419801 - cluster/prob_snapshot/cluster_16:0.040506756502388476 - cluster/prob_snapshot/cluster_17:0.03481574979125278 - cluster/prob_snapshot/cluster_18:0.04390336684865425 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.025220334292181026 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.02009719166615599 - cluster/prob_snapshot/cluster_26:0.04229181660749168 - cluster/prob_snapshot/cluster_27:0.03329242085230502 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.02976911799587432 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.032051444164607285 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.04585015556663423 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.04502187573799245 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.04080796862384706 - cluster/prob_snapshot/cluster_44:0.037096055368757604 - cluster/prob_snapshot/cluster_45:0.0302185484509988 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.027306045899206904 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.043280198597007435 - cluster/prob_snapshot/cluster_52:0.0 - 
cluster/prob_snapshot/cluster_53:0.046216130996334946 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.03411258170870705 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.059737389427201304 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.025953953903162727
[36m(TaskRunner pid=2823680)[0m Training Progress:  36%|███▌      | 284/800 [9:27:13<18:52:23, 131.67s/it]
[36m(TaskRunner pid=2823680)[0m step:284 - global_seqlen/min:401315 - global_seqlen/max:558806 - global_seqlen/minmax_diff:157491 - global_seqlen/balanced_min:465245 - global_seqlen/balanced_max:465418 - global_seqlen/mean:465337.5 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.18003320280695334) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012432216666638851 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.02226054429775104) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0008708954804509025) - actor/ppo_kl:np.float64(9.489778012294892e-05) - actor/pg_clipfrac_lower:np.float64(1.0281338745699031e-05) - actor/grad_norm:np.float64(0.42630308493971825) - perf/mfu/actor:np.float64(0.2554955751683129) - perf/max_memory_allocated_gb:np.float64(93.24382066726685) - perf/max_memory_reserved_gb:np.float64(99.0078125) - perf/cpu_memory_used_gb:np.float64(114.15263748168945) - actor/lr:np.float64(1e-06) - training/global_step:284 - training/epoch:0 - critic/score/mean:0.5442708134651184 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5570834875106812 - critic/rewards/max:2.130232334136963 - critic/rewards/min:-0.07394160330295563 - critic/advantages/mean:-0.012348085641860962 - critic/advantages/max:2.4747321605682373 - critic/advantages/min:-2.474787473678589 - critic/returns/mean:-0.012348085641860962 - critic/returns/max:2.4747321605682373 - critic/returns/min:-2.474787473678589 - response_length/mean:1363.5013427734375 - response_length/max:8192.0 - response_length/min:163.0 - response_length/clip_ratio:0.02994791604578495 - response_length_non_aborted/mean:1363.5013427734375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:163.0 - response_length_non_aborted/clip_ratio:0.02994791604578495 - response/aborted_ratio:0.0 - prompt_length/mean:235.1458282470703 - prompt_length/max:353.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.636565715074539e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3759423829615116) - timing_s/agent_loop/generate_sequences/max:np.float64(33.97064333502203) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.344591904395202) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(33.97064333502203) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:215 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:35.85343265067786 - timing_s/reward:0.0016465261578559875 - timing_s/old_log_prob:10.943633857183158 - timing_s/ref:24.97085855808109 - timing_s/adv:0.09281112719327211 - timing_s/update_actor:21.895702488720417 - timing_s/update_weights:30.946461306884885 - timing_s/step:125.08517830539495 - timing_s/stop_profile:5.6898221373558044e-05 - timing_per_token_ms/adv:7.559380628092285e-05 - timing_per_token_ms/update_actor:0.017833847539317844 - timing_per_token_ms/gen:0.03423843968898799 - timing_per_token_ms/ref:0.02033853376844605 - perf/total_num_tokens:1861350 - perf/time_per_step:125.08517830539495 - perf/throughput:3720.164981208888 - frontier/active_count:26.0 - frontier/completed_count:38.0 - frontier/blacklisted_count:1897.0 - frontier/mean_score:2.4195910646881167 - frontier/mean_frontier_pct:0.6928782167369388 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:112.0 - frontier/cluster_2/score:2.4765176219806992 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:2.2057679267989996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:176.0 - frontier/cluster_6/score:1.2677952497273457 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:224.0 - frontier/cluster_8/score:2.668764286912542 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:2.5373708382473135 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:144.0 - frontier/cluster_11/score:2.8187283734300568 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:144.0 - frontier/cluster_15/score:2.287891436810656 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:2.790876706943039 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:144.0 - frontier/cluster_17/score:2.5791402586135095 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:3.017429933471509 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:1.737656865051993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:144.0 - frontier/cluster_25/score:1.3846772474299998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:96.0 - frontier/cluster_26/score:2.9138656376299994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:2.5056711257114563 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:224.0 - frontier/cluster_32/score:1.7357445923180224 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:160.0 - frontier/cluster_37/score:2.445819652490057 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:192.0 - frontier/cluster_39/score:3.159031782095141 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:3.1019640956105214 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:128.0 - frontier/cluster_43/score:2.868140953430056 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:176.0 - frontier/cluster_44/score:2.6891178670259896 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:176.0 - frontier/cluster_45/score:2.0820290309975653 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:192.0 - frontier/cluster_48/score:1.6169522772830336 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:144.0 - frontier/cluster_51/score:2.9873749467053514 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:112.0 - frontier/cluster_53/score:3.1289729969509095 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:112.0 - frontier/cluster_57/score:2.3503241909509094 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:160.0 - frontier/cluster_63/score:1.5517417872743626 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:284.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.03936643640265971 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.035062630703152776 - cluster/prob_snapshot/cluster_6:0.02015272600001495 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.04242236705362352 - cluster/prob_snapshot/cluster_9:0.04033375205864159 - 
cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0448061787504097 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.036368056477370125 - cluster/prob_snapshot/cluster_16:0.04436345189567714 - cluster/prob_snapshot/cluster_17:0.040997713912103675 - cluster/prob_snapshot/cluster_18:0.0479647156641204 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.027621591649731228 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.022010668656404095 - cluster/prob_snapshot/cluster_26:0.04631846964929484 - cluster/prob_snapshot/cluster_27:0.03982985711097417 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.02759119438451565 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.03887846504605872 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.05021560219885175 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.049308460884489966 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.045591635381445325 - cluster/prob_snapshot/cluster_44:0.04274590519847291 - cluster/prob_snapshot/cluster_45:0.03309569159120469 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.02570288554574816 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.047486965086207526 - cluster/prob_snapshot/cluster_52:0.0 - 
cluster/prob_snapshot/cluster_53:0.04973779124236229 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.03736048028388416 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.024666307172580976
[36m(TaskRunner pid=2823680)[0m Training Progress:  36%|███▌      | 285/800 [9:29:18<18:31:46, 129.53s/it]
[36m(TaskRunner pid=2823680)[0m step:285 - global_seqlen/min:418934 - global_seqlen/max:547001 - global_seqlen/minmax_diff:128067 - global_seqlen/balanced_min:466580 - global_seqlen/balanced_max:466641 - global_seqlen/mean:466608.0 - frontier/skipped_zero_acc_count:43.0 - actor/entropy:np.float64(0.20904712703858697) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01466620434075594 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.011066109771491028) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006455285437432182) - actor/ppo_kl:np.float64(-1.8183840206392615e-05) - actor/pg_clipfrac_lower:np.float64(3.714456334121276e-06) - actor/grad_norm:np.float64(0.5577120076526295) - perf/mfu/actor:np.float64(0.2682040987719755) - perf/max_memory_allocated_gb:np.float64(93.24382066726685) - perf/max_memory_reserved_gb:np.float64(99.0078125) - perf/cpu_memory_used_gb:np.float64(113.6851692199707) - actor/lr:np.float64(1e-06) - training/global_step:285 - training/epoch:0 - critic/score/mean:0.5602940917015076 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5682884454727173 - critic/rewards/max:1.9447131156921387 - critic/rewards/min:-0.3027799427509308 - critic/advantages/mean:-0.03344135358929634 - critic/advantages/max:2.47464919090271 - critic/advantages/min:-2.474775552749634 - critic/returns/mean:-0.03344135358929634 - critic/returns/max:2.47464919090271 - critic/returns/min:-2.474775552749634 - response_length/mean:1342.7220458984375 - response_length/max:8192.0 - response_length/min:172.0 - response_length/clip_ratio:0.030882352963089943 - response_length_non_aborted/mean:1342.7220458984375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:172.0 - response_length_non_aborted/clip_ratio:0.030882352963089943 - response/aborted_ratio:0.0 - prompt_length/mean:244.69412231445312 - prompt_length/max:962.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.119976311922073e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3614618266001344) - timing_s/agent_loop/generate_sequences/max:np.float64(35.80204998888075) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.242772546608649) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(35.80204998888075) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:183 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:37.9946718364954 - timing_s/reward:0.00026709772646427155 - timing_s/old_log_prob:9.941203116439283 - timing_s/ref:23.54627327620983 - timing_s/adv:0.06754562444984913 - timing_s/update_actor:21.214942794293165 - timing_s/update_weights:30.346769904717803 - timing_s/step:123.51663697883487 - timing_s/stop_profile:6.890669465065002e-05 - timing_per_token_ms/adv:6.257451708876627e-05 - timing_per_token_ms/update_actor:0.01965360171337733 - timing_per_token_ms/gen:0.04161286920062012 - timing_per_token_ms/ref:0.021813354921204575 - perf/total_num_tokens:1866432 - perf/time_per_step:123.51663697883487 - perf/throughput:3777.693527066766 - frontier/active_count:24.0 - frontier/completed_count:40.0 - frontier/blacklisted_count:1939.0 - frontier/mean_score:2.3940295699718503 - frontier/mean_frontier_pct:0.7206552268913025 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:5.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:128.0 - frontier/cluster_2/score:2.033562335386489 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.4440375487592996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:176.0 - frontier/cluster_6/score:1.2677952497273457 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:240.0 - frontier/cluster_8/score:2.168135000838779 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:2.5373708382473135 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:160.0 - frontier/cluster_11/score:2.2731098614010397 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:144.0 - frontier/cluster_15/score:2.287891436810656 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:144.0 - frontier/cluster_16/score:2.853613694860127 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:144.0 - frontier/cluster_17/score:2.5791402586135095 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:128.0 - frontier/cluster_18/score:3.012200953430056 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:208.0 - frontier/cluster_23/score:1.5163598055363952 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:96.0 - frontier/cluster_26/score:2.9138656376299994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:144.0 - frontier/cluster_27/score:2.053969787998019 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:224.0 - frontier/cluster_32/score:1.7357445923180224 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - 
frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:160.0 - frontier/cluster_37/score:2.445819652490057 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:208.0 - frontier/cluster_39/score:3.1113222474665982 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:128.0 - frontier/cluster_43/score:2.907698667401039 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:176.0 - frontier/cluster_44/score:2.6891178670259896 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:176.0 - frontier/cluster_45/score:2.357420321698296 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:192.0 - frontier/cluster_48/score:1.6169522772830336 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:144.0 - 
frontier/cluster_51/score:2.9911624626937456 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:112.0 - frontier/cluster_53/score:3.1289729969509095 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:112.0 - frontier/cluster_57/score:2.5452269336656363 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:160.0 - frontier/cluster_63/score:1.9862192510920538 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:285.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.03539294795570689 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.04253702591742349 - cluster/prob_snapshot/cluster_6:0.022065225398445628 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.037735105489672245 - cluster/prob_snapshot/cluster_9:0.044161436539071036 - cluster/prob_snapshot/cluster_10:0.0 - 
cluster/prob_snapshot/cluster_11:0.03956213076049167 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.039819395325276445 - cluster/prob_snapshot/cluster_16:0.049665456145794394 - cluster/prob_snapshot/cluster_17:0.044888408560255645 - cluster/prob_snapshot/cluster_18:0.052425573448978496 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.026391344265960494 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.05071410552210134 - cluster/prob_snapshot/cluster_27:0.0357481275809487 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.030209606536912847 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.04256804237730613 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.054150720861521874 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.050606773058001334 - cluster/prob_snapshot/cluster_44:0.046802503694249285 - cluster/prob_snapshot/cluster_45:0.04102950438435226 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.028142096655160327 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.05205941097894272 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.054457921701646955 - cluster/prob_snapshot/cluster_54:0.0 - 
cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.04429816722661247 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.03456896961516729
[36m(TaskRunner pid=2823680)[0m Training Progress:  36%|███▌      | 286/800 [9:31:35<18:50:26, 131.96s/it]
[36m(TaskRunner pid=2823680)[0m step:286 - global_seqlen/min:410770 - global_seqlen/max:565480 - global_seqlen/minmax_diff:154710 - global_seqlen/balanced_min:481981 - global_seqlen/balanced_max:482075 - global_seqlen/mean:482022.0 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.16949576997042945) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008932839147746563 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.01941103938224842) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.002139854305899765) - actor/ppo_kl:np.float64(0.0007684278809042174) - actor/pg_clipfrac_lower:np.float64(0.0002708273733939374) - actor/grad_norm:np.float64(0.31338414674003917) - perf/mfu/actor:np.float64(0.23365693609282479) - perf/max_memory_allocated_gb:np.float64(93.24382066726685) - perf/max_memory_reserved_gb:np.float64(99.0078125) - perf/cpu_memory_used_gb:np.float64(114.35267543792725) - actor/lr:np.float64(1e-06) - training/global_step:286 - training/epoch:0 - critic/score/mean:0.4830729067325592 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5026686787605286 - critic/rewards/max:1.2489651441574097 - critic/rewards/min:-0.1036263108253479 - critic/advantages/mean:-0.04888171702623367 - critic/advantages/max:2.4746713638305664 - critic/advantages/min:-2.4746170043945312 - critic/returns/mean:-0.04888171702623367 - critic/returns/max:2.4746713638305664 - critic/returns/min:-2.4746170043945312 - response_length/mean:1547.6849365234375 - response_length/max:8192.0 - response_length/min:144.0 - response_length/clip_ratio:0.0481770820915699 - response_length_non_aborted/mean:1547.6849365234375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:144.0 - response_length_non_aborted/clip_ratio:0.0481770820915699 - response/aborted_ratio:0.0 - prompt_length/mean:241.8541717529297 - prompt_length/max:500.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.495897054672241e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1298709521070123) - timing_s/agent_loop/generate_sequences/max:np.float64(36.75543455965817) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.850110325354763) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(36.75543455965817) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:185 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:39.02747582178563 - timing_s/reward:0.000125199556350708 - timing_s/old_log_prob:11.291528983972967 - timing_s/ref:27.828916295431554 - timing_s/adv:0.07999858912080526 - timing_s/update_actor:24.716570458374918 - timing_s/update_weights:33.921258515678346 - timing_s/step:137.2634171191603 - timing_s/stop_profile:5.6704506278038025e-05 - timing_per_token_ms/adv:5.82076310973971e-05 - timing_per_token_ms/update_actor:0.017983979855711592 - timing_per_token_ms/gen:0.032834219644079975 - timing_per_token_ms/ref:0.020248548272753804 - perf/total_num_tokens:1928088 - perf/time_per_step:137.2634171191603 - perf/throughput:3511.656711719117 - frontier/active_count:24.0 - frontier/completed_count:40.0 - frontier/blacklisted_count:1970.0 - frontier/mean_score:2.383001592453224 - frontier/mean_frontier_pct:0.7535765388454019 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:5.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:144.0 - frontier/cluster_2/score:1.7234936347705423 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.6108262841315097 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:176.0 - frontier/cluster_6/score:1.2677952497273457 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:240.0 - frontier/cluster_8/score:2.417694500587145 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:2.5373708382473135 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:176.0 - frontier/cluster_11/score:1.8911769029807277 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:1.9015240057674594 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:144.0 - frontier/cluster_16/score:2.8975295864020887 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:144.0 - frontier/cluster_17/score:2.5791402586135095 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:128.0 - frontier/cluster_18/score:3.012200953430056 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:208.0 - frontier/cluster_23/score:1.9614518638754765 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:112.0 - frontier/cluster_26/score:2.9397059463409994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:160.0 - frontier/cluster_27/score:1.7377788515986134 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:224.0 - frontier/cluster_32/score:2.1150212146226153 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - 
frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:160.0 - frontier/cluster_37/score:2.445819652490057 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:208.0 - frontier/cluster_39/score:3.0779255732266186 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:3.5353890671807275 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:192.0 - frontier/cluster_44/score:2.1823825069181924 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:192.0 - frontier/cluster_45/score:2.550194225188807 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:192.0 - frontier/cluster_48/score:1.6169522772830336 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:160.0 - 
frontier/cluster_51/score:2.993813723885622 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:112.0 - frontier/cluster_53/score:3.1289729969509095 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.081658853565945 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:160.0 - frontier/cluster_63/score:1.9862192510920538 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:286.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.030135202179272375 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.04565017029362934 - cluster/prob_snapshot/cluster_6:0.022167338133245347 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.04227327047402093 - cluster/prob_snapshot/cluster_9:0.04436580540348366 - cluster/prob_snapshot/cluster_10:0.0 - 
cluster/prob_snapshot/cluster_11:0.0330671359489424 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.033248054536720875 - cluster/prob_snapshot/cluster_16:0.050663163556316504 - cluster/prob_snapshot/cluster_17:0.0450961416822213 - cluster/prob_snapshot/cluster_18:0.05266818681828022 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.03429589021410432 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.051400615153643726 - cluster/prob_snapshot/cluster_27:0.030384978499070608 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.036981042825022285 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.042765037383870774 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.053817378591181034 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.061816105480461826 - cluster/prob_snapshot/cluster_44:0.038158851736776425 - cluster/prob_snapshot/cluster_45:0.044590021698983 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.028272331737764965 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.05234668700611992 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.05470994030630875 - cluster/prob_snapshot/cluster_54:0.0 - 
cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.03639770356844621 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.03472894677211316
[36m(RewardLoopWorker pid=2826757)[0m WARNING:2026-04-12 21:04:00,242:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  36%|███▌      | 287/800 [9:33:47<18:46:58, 131.81s/it]
[36m(TaskRunner pid=2823680)[0m step:287 - global_seqlen/min:379696 - global_seqlen/max:517268 - global_seqlen/minmax_diff:137572 - global_seqlen/balanced_min:450815 - global_seqlen/balanced_max:450989 - global_seqlen/mean:450869.5 - frontier/skipped_zero_acc_count:29.0 - actor/entropy:np.float64(0.19155946245417) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010350244119763374 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.018024111748673022) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0009008599512162618) - actor/ppo_kl:np.float64(0.00045481015740506337) - actor/pg_clipfrac_lower:np.float64(8.966707337094704e-06) - actor/grad_norm:np.float64(0.36085350123735577) - perf/mfu/actor:np.float64(0.22937239818903304) - perf/max_memory_allocated_gb:np.float64(93.24382066726685) - perf/max_memory_reserved_gb:np.float64(99.0078125) - perf/cpu_memory_used_gb:np.float64(113.45679664611816) - actor/lr:np.float64(1e-06) - training/global_step:287 - training/epoch:0 - critic/score/mean:0.5265151262283325 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5377935767173767 - critic/rewards/max:1.2083375453948975 - critic/rewards/min:-0.08146936446428299 - critic/advantages/mean:-0.08781310170888901 - critic/advantages/max:2.474782705307007 - critic/advantages/min:-2.4748475551605225 - critic/returns/mean:-0.08781310170888901 - critic/returns/max:2.474782705307007 - critic/returns/min:-2.4748475551605225 - response_length/mean:1410.53662109375 - response_length/max:8192.0 - response_length/min:136.0 - response_length/clip_ratio:0.04292929172515869 - response_length_non_aborted/mean:1410.53662109375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:136.0 - response_length_non_aborted/clip_ratio:0.04292929172515869 - response/aborted_ratio:0.0 - prompt_length/mean:239.4242401123047 - prompt_length/max:456.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.931942284107208e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4497778182849288) - timing_s/agent_loop/generate_sequences/max:np.float64(32.94075718522072) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.3330132142828) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(32.94075718522072) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:201 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:35.131408989429474 - timing_s/reward:0.00024512410163879395 - timing_s/old_log_prob:11.546017749235034 - timing_s/ref:24.894380128011107 - timing_s/adv:0.09664350375533104 - timing_s/update_actor:23.538778744637966 - timing_s/update_weights:35.60709248110652 - timing_s/step:131.21175196487457 - timing_s/stop_profile:6.200931966304779e-05 - timing_per_token_ms/adv:7.395607315090198e-05 - timing_per_token_ms/update_actor:0.018012960779325165 - timing_per_token_ms/gen:0.031447492482559986 - timing_per_token_ms/ref:0.019050329574707623 - perf/total_num_tokens:1803478 - perf/time_per_step:131.21175196487457 - perf/throughput:3436.1975451764256 - frontier/active_count:19.0 - frontier/completed_count:45.0 - frontier/blacklisted_count:1997.0 - frontier/mean_score:2.4293300776775943 - frontier/mean_frontier_pct:0.7344345104405611 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 
- frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:160.0 - frontier/cluster_2/score:1.5064455443393796 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.6108262841315097 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:1.9015240057674594 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:144.0 - frontier/cluster_17/score:2.7053981810294565 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:128.0 - frontier/cluster_18/score:3.008540667401039 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:208.0 - frontier/cluster_23/score:1.9614518638754765 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:112.0 - frontier/cluster_26/score:2.957794162438699 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:160.0 - frontier/cluster_27/score:1.7377788515986134 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:224.0 - frontier/cluster_32/score:2.1150212146226153 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:160.0 - frontier/cluster_37/score:2.6120737567430394 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:208.0 - frontier/cluster_39/score:3.0779255732266186 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:3.374772347026509 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:192.0 - frontier/cluster_44/score:2.4276677548427346 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:192.0 - frontier/cluster_45/score:2.6851359576321645 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:192.0 - frontier/cluster_48/score:1.6169522772830336 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:176.0 - frontier/cluster_51/score:2.395669606719935 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:112.0 - frontier/cluster_53/score:3.0902810978656365 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.081658853565945 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:176.0 - frontier/cluster_63/score:2.2903534757644373 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:287.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.03263723127842979 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.056563704929918766 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - 
cluster/prob_snapshot/cluster_15:0.041196629370983444 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.05861261063586757 - cluster/prob_snapshot/cluster_18:0.06518021042412694 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.04249496993990855 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.06408078441085265 - cluster/prob_snapshot/cluster_27:0.037649080979730874 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.045822058951818775 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.05659073149738348 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.06668343848781016 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.073114641293094 - cluster/prob_snapshot/cluster_44:0.05259556462542721 - cluster/prob_snapshot/cluster_45:0.05817362837479777 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0350313661440796 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.05190232286525245 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.06695112165546613 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0450992614382329 - cluster/prob_snapshot/cluster_58:0.0 - 
cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.049620642696819074
[36m(RewardLoopWorker pid=2826754)[0m WARNING:2026-04-12 21:06:07,247:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  36%|███▌      | 288/800 [9:35:52<18:29:29, 130.02s/it]
[36m(TaskRunner pid=2823680)[0m step:288 - global_seqlen/min:446701 - global_seqlen/max:506872 - global_seqlen/minmax_diff:60171 - global_seqlen/balanced_min:467805 - global_seqlen/balanced_max:467934 - global_seqlen/mean:467853.0 - frontier/skipped_zero_acc_count:43.0 - actor/entropy:np.float64(0.19802079172155193) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008347069844603539 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0174812093609944) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0011672898986563628) - actor/ppo_kl:np.float64(-0.002451670862794429) - actor/pg_clipfrac_lower:np.float64(0.00018079188051816283) - actor/grad_norm:np.float64(0.35431290892037476) - perf/mfu/actor:np.float64(0.26690816687307656) - perf/max_memory_allocated_gb:np.float64(93.24382066726685) - perf/max_memory_reserved_gb:np.float64(99.0078125) - perf/cpu_memory_used_gb:np.float64(112.84388732910156) - actor/lr:np.float64(1e-06) - training/global_step:288 - training/epoch:0 - critic/score/mean:0.5632352828979492 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.577440619468689 - critic/rewards/max:1.7872861623764038 - critic/rewards/min:-0.06359679251909256 - critic/advantages/mean:0.008068575523793697 - critic/advantages/max:2.4745469093322754 - critic/advantages/min:-2.4748237133026123 - critic/returns/mean:0.008068575523793697 - critic/returns/max:2.4745469093322754 - critic/returns/min:-2.4748237133026123 - response_length/mean:1365.3970947265625 - response_length/max:8192.0 - response_length/min:222.0 - response_length/clip_ratio:0.036764707416296005 - response_length_non_aborted/mean:1365.3970947265625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:222.0 - response_length_non_aborted/clip_ratio:0.036764707416296005 - response/aborted_ratio:0.0 - prompt_length/mean:242.11764526367188 - prompt_length/max:678.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.694734424352646e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7163775181397796) - timing_s/agent_loop/generate_sequences/max:np.float64(34.468330113217235) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.341169681145402) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(34.468330113217235) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:223 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:36.76478260755539 - timing_s/reward:0.00016026757657527924 - timing_s/old_log_prob:11.84459913149476 - timing_s/ref:23.24836882855743 - timing_s/adv:0.08505517616868019 - timing_s/update_actor:21.69736985117197 - timing_s/update_weights:31.604210953228176 - timing_s/step:125.61122076399624 - timing_s/stop_profile:7.00000673532486e-05 - timing_per_token_ms/adv:7.781026261646147e-05 - timing_per_token_ms/update_actor:0.019849209915902305 - timing_per_token_ms/gen:0.03959716803726064 - timing_per_token_ms/ref:0.021268096375074267 - perf/total_num_tokens:1871412 - perf/time_per_step:125.61122076399624 - perf/throughput:3724.6115208053134 - frontier/active_count:18.0 - frontier/completed_count:46.0 - frontier/blacklisted_count:2038.0 - frontier/mean_score:2.4165814246722137 - frontier/mean_frontier_pct:0.766664365713058 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:160.0 - frontier/cluster_2/score:1.5064455443393796 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:144.0 - frontier/cluster_5/score:2.1275783988920565 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.2310668040372215 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:160.0 - frontier/cluster_17/score:2.7937787267206193 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:144.0 - frontier/cluster_18/score:3.605978467180727 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:224.0 - frontier/cluster_23/score:2.2730163047128333 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:128.0 - frontier/cluster_26/score:2.9704559137070894 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:160.0 - frontier/cluster_27/score:2.116445196119029 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:240.0 - frontier/cluster_32/score:1.7805148502358306 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:176.0 - frontier/cluster_37/score:2.7284516297201273 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:208.0 - frontier/cluster_39/score:3.0779255732266186 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:3.374772347026509 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:208.0 - frontier/cluster_45/score:2.179595170342515 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:208.0 - frontier/cluster_48/score:1.4318665940981234 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:192.0 - frontier/cluster_51/score:1.9769687247039545 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - 
frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:3.063196768505945 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.3571611974961613 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:192.0 - frontier/cluster_63/score:1.9032474330351061 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:288.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.034632153618129165 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.04891157348628554 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - 
cluster/prob_snapshot/cluster_15:0.05129070120062602 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.06422706376769795 - cluster/prob_snapshot/cluster_18:0.08289898077519531 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.05225509155450283 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.06828875156220605 - cluster/prob_snapshot/cluster_27:0.04865562876253096 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.040932819672395516 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.06272523844965129 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.07075940559397892 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0775837100701106 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0501074035156952 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.032917634516433626 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.04544915999748859 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.07042080043854233 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0541895251382479 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - 
cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.04375435788028224
[36m(TaskRunner pid=2823680)[0m Training Progress:  36%|███▌      | 289/800 [9:38:03<18:27:30, 130.04s/it]
[36m(TaskRunner pid=2823680)[0m step:289 - global_seqlen/min:416950 - global_seqlen/max:531900 - global_seqlen/minmax_diff:114950 - global_seqlen/balanced_min:498940 - global_seqlen/balanced_max:499070 - global_seqlen/mean:498993.75 - frontier/skipped_zero_acc_count:42.0 - actor/entropy:np.float64(0.1733918198882494) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010802067816257477 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.02077929693768965) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0005909322185183301) - actor/ppo_kl:np.float64(0.0001976608120278184) - actor/pg_clipfrac_lower:np.float64(3.7065774238218966e-06) - actor/grad_norm:np.float64(0.3303786949677901) - perf/mfu/actor:np.float64(0.28851635209881826) - perf/max_memory_allocated_gb:np.float64(93.24382066726685) - perf/max_memory_reserved_gb:np.float64(99.0078125) - perf/cpu_memory_used_gb:np.float64(113.16878700256348) - actor/lr:np.float64(1e-06) - training/global_step:289 - training/epoch:0 - critic/score/mean:0.5566860437393188 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5671011209487915 - critic/rewards/max:1.279269814491272 - critic/rewards/min:-0.06280231475830078 - critic/advantages/mean:-0.0884467139840126 - critic/advantages/max:2.474827289581299 - critic/advantages/min:-2.4748120307922363 - critic/returns/mean:-0.0884467139840126 - critic/returns/max:2.474827289581299 - critic/returns/min:-2.4748120307922363 - response_length/mean:1499.9259033203125 - response_length/max:8192.0 - response_length/min:119.0 - response_length/clip_ratio:0.04360464960336685 - response_length_non_aborted/mean:1499.9259033203125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:119.0 - response_length_non_aborted/clip_ratio:0.04360464960336685 - response/aborted_ratio:0.0 - prompt_length/mean:240.58139038085938 - prompt_length/max:619.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.931158274412155e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2547659119591117) - timing_s/agent_loop/generate_sequences/max:np.float64(36.61396942380816) - timing_s/agent_loop/generate_sequences/mean:np.float64(9.243156138444647) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(36.61396942380816) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:196 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:38.14851439744234 - timing_s/reward:0.0002796407788991928 - timing_s/old_log_prob:11.566071404144168 - timing_s/ref:25.951624607667327 - timing_s/adv:0.09430790971964598 - timing_s/update_actor:20.8675091303885 - timing_s/update_weights:32.81135607790202 - timing_s/step:129.85654294397682 - timing_s/stop_profile:6.583891808986664e-05 - timing_per_token_ms/adv:7.87560343688613e-05 - timing_per_token_ms/update_actor:0.017426346010116753 - timing_per_token_ms/gen:0.03696744160558549 - timing_per_token_ms/ref:0.021672063834360076 - perf/total_num_tokens:1995975 - perf/time_per_step:129.85654294397682 - perf/throughput:3842.653890880783 - frontier/active_count:16.0 - frontier/completed_count:48.0 - frontier/blacklisted_count:2079.0 - frontier/mean_score:2.3077557469369845 - frontier/mean_frontier_pct:0.7846142510408027 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:5.0 - frontier/batch_hard_count:9.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:176.0 - frontier/cluster_2/score:1.3545118810375656 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:160.0 - frontier/cluster_5/score:1.7893048792244395 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:176.0 - frontier/cluster_17/score:2.2556451087044334 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:160.0 - frontier/cluster_18/score:4.024184927026509 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:128.0 - frontier/cluster_26/score:2.9793191395949625 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:160.0 - frontier/cluster_27/score:2.116445196119029 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:240.0 - frontier/cluster_32/score:2.146360395165081 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 
- frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:192.0 - frontier/cluster_37/score:2.209916140804089 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:224.0 - frontier/cluster_39/score:2.454547901258633 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:160.0 - frontier/cluster_43/score:3.262340642918556 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:224.0 - frontier/cluster_45/score:1.8257166192397605 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:208.0 - frontier/cluster_48/score:1.9023066158686863 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:192.0 - frontier/cluster_51/score:1.9769687247039545 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - 
frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:3.044237737954161 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:144.0 - frontier/cluster_57/score:1.9500128382473128 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:208.0 - frontier/cluster_63/score:1.6322732031245744 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:289.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.03668368833105954 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.04845900832441135 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - 
cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.061088709011403286 - cluster/prob_snapshot/cluster_18:0.10898534572949525 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.08068767523245593 - cluster/prob_snapshot/cluster_27:0.05731881501454725 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.058128996049892884 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.05985025017642262 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.06647551155804945 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.08835263023525572 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.04944513250760452 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.05151938789432009 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.05354143108862166 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.08244583893883395 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.05281139589933604 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - 
cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.04420618400829036
[36m(TaskRunner pid=2823680)[0m Training Progress:  36%|███▋      | 290/800 [9:40:23<18:50:39, 133.02s/it]
[36m(TaskRunner pid=2823680)[0m step:290 - global_seqlen/min:389674 - global_seqlen/max:642138 - global_seqlen/minmax_diff:252464 - global_seqlen/balanced_min:492502 - global_seqlen/balanced_max:492621 - global_seqlen/mean:492534.5 - frontier/skipped_zero_acc_count:35.0 - actor/entropy:np.float64(0.19929552044560936) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0073579344898462296 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.009022619793540798) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0007683175050608933) - actor/ppo_kl:np.float64(0.001609082118171558) - actor/pg_clipfrac_lower:np.float64(8.685999014136194e-06) - actor/grad_norm:np.float64(0.3670586583515008) - perf/mfu/actor:np.float64(0.24817951556820939) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(100.7109375) - perf/cpu_memory_used_gb:np.float64(113.69170761108398) - actor/lr:np.float64(1e-06) - training/global_step:290 - training/epoch:0 - critic/score/mean:0.5161290168762207 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.543023943901062 - critic/rewards/max:1.5113722085952759 - critic/rewards/min:-0.10844779759645462 - critic/advantages/mean:-0.05375168099999428 - critic/advantages/max:2.474367618560791 - critic/advantages/min:-2.474745988845825 - critic/returns/mean:-0.05375168099999428 - critic/returns/max:2.474367618560791 - critic/returns/min:-2.474745988845825 - response_length/mean:1623.1531982421875 - response_length/max:8192.0 - response_length/min:218.0 - response_length/clip_ratio:0.05510752648115158 - response_length_non_aborted/mean:1623.1531982421875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:218.0 - response_length_non_aborted/clip_ratio:0.05510752648115158 - response/aborted_ratio:0.0 - prompt_length/mean:247.77420043945312 - prompt_length/max:817.0 - prompt_length/min:184.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.56319272518158e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.428263264708221) - timing_s/agent_loop/generate_sequences/max:np.float64(35.609358897432685) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.699222294978426) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(35.609358897432685) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:205 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:37.93114748504013 - timing_s/reward:0.00016767531633377075 - timing_s/old_log_prob:11.735256058163941 - timing_s/ref:28.758029744960368 - timing_s/adv:0.08887937106192112 - timing_s/update_actor:23.885522830300033 - timing_s/update_weights:36.86233932059258 - timing_s/step:139.72367482818663 - timing_s/stop_profile:7.469207048416138e-05 - timing_per_token_ms/adv:6.385149899920338e-05 - timing_per_token_ms/update_actor:0.01715950978131715 - timing_per_token_ms/gen:0.031409681047807954 - timing_per_token_ms/ref:0.020659949384656543 - perf/total_num_tokens:1970138 - perf/time_per_step:139.72367482818663 - perf/throughput:3525.0611652295333 - frontier/active_count:14.0 - frontier/completed_count:50.0 - frontier/blacklisted_count:2114.0 - frontier/mean_score:2.372994388118823 - frontier/mean_frontier_pct:0.8072398201498027 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:176.0 - frontier/cluster_2/score:1.8481583167262956 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:160.0 - frontier/cluster_5/score:2.1525134154571077 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:192.0 - frontier/cluster_17/score:1.8789515760931033 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:176.0 - frontier/cluster_18/score:4.316929448918556 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:144.0 - frontier/cluster_26/score:2.9855233977164737 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:176.0 - frontier/cluster_27/score:1.7815116372833204 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:256.0 - frontier/cluster_32/score:2.4024522766155565 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:240.0 - frontier/cluster_39/score:2.0181835308810427 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:176.0 - frontier/cluster_43/score:2.583638450042989 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:224.0 - frontier/cluster_48/score:2.23161463110808 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:192.0 - frontier/cluster_51/score:2.283878107292768 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - 
frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:144.0 - frontier/cluster_53/score:3.0309664165679124 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:144.0 - frontier/cluster_57/score:2.2650089867731187 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:224.0 - frontier/cluster_63/score:1.442591242187202 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:290.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.05563068711774059 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0647919603252081 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - 
cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0565575829153932 - cluster/prob_snapshot/cluster_18:0.129942196676928 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.08986606640672098 - cluster/prob_snapshot/cluster_27:0.053624581613697035 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.07231527175249923 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.06074854926470425 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.07776908554798422 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.0671729549286936 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0687461172844305 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.09123392885688594 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.06817814530372111 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - 
cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.04342287200539326
[36m(TaskRunner pid=2823680)[0m Training Progress:  36%|███▋      | 291/800 [9:42:42<19:05:32, 135.04s/it]
[36m(TaskRunner pid=2823680)[0m step:291 - global_seqlen/min:425956 - global_seqlen/max:597708 - global_seqlen/minmax_diff:171752 - global_seqlen/balanced_min:498254 - global_seqlen/balanced_max:498391 - global_seqlen/mean:498337.0 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.19504185552553585) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012194332666695118 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.02604290558883804) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0030753261476850944) - actor/ppo_kl:np.float64(0.005007822490810554) - actor/pg_clipfrac_lower:np.float64(0.00030176247973183007) - actor/grad_norm:np.float64(0.4085046884914239) - perf/mfu/actor:np.float64(0.2474350222544112) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(113.95645141601562) - actor/lr:np.float64(1e-06) - training/global_step:291 - training/epoch:0 - critic/score/mean:0.5716145634651184 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5990859866142273 - critic/rewards/max:1.7363436222076416 - critic/rewards/min:-0.1349513679742813 - critic/advantages/mean:0.0015074179973453283 - critic/advantages/max:2.474684000015259 - critic/advantages/min:-2.4746527671813965 - critic/returns/mean:0.0015074179973453283 - critic/returns/max:2.474684000015259 - critic/returns/min:-2.4746527671813965 - response_length/mean:1607.53515625 - response_length/max:8192.0 - response_length/min:123.0 - response_length/clip_ratio:0.046875 - response_length_non_aborted/mean:1607.53515625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:123.0 - response_length_non_aborted/clip_ratio:0.046875 - response/aborted_ratio:0.0 - prompt_length/mean:238.1145782470703 - prompt_length/max:470.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.476525545120239e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0135404849424958) - timing_s/agent_loop/generate_sequences/max:np.float64(34.310570327565074) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.861594491020696) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(34.310570327565074) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:203 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:36.46923299692571 - timing_s/reward:0.00026099756360054016 - timing_s/old_log_prob:11.494362089782953 - timing_s/ref:29.9234421774745 - timing_s/adv:0.07309294957667589 - timing_s/update_actor:24.291937520727515 - timing_s/update_weights:36.783242852427065 - timing_s/step:139.4834034331143 - timing_s/stop_profile:6.076879799365997e-05 - timing_per_token_ms/adv:5.156618256801494e-05 - timing_per_token_ms/update_actor:0.017137665019395634 - timing_per_token_ms/gen:0.02953962174956136 - timing_per_token_ms/ref:0.021110622725224855 - perf/total_num_tokens:1993348 - perf/time_per_step:139.4834034331143 - perf/throughput:3572.733298259135 - frontier/active_count:14.0 - frontier/completed_count:50.0 - frontier/blacklisted_count:2146.0 - frontier/mean_score:2.4325246431117473 - frontier/mean_frontier_pct:0.8405748494813384 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:16.0 - frontier/replay_pool_size:5233.0 
- frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:192.0 - frontier/cluster_2/score:2.1937108217084065 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:176.0 - frontier/cluster_5/score:2.406759390819975 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:192.0 - frontier/cluster_17/score:2.215266103265172 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:176.0 - frontier/cluster_18/score:3.9218506142429885 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:144.0 - frontier/cluster_26/score:2.9898663784015316 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:176.0 - frontier/cluster_27/score:2.147058146098324 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:256.0 - frontier/cluster_32/score:2.5817165936308895 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:240.0 - frontier/cluster_39/score:2.3127284716167296 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:176.0 - frontier/cluster_43/score:2.7085469150300923 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:240.0 - frontier/cluster_48/score:1.862130241775656 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:208.0 - frontier/cluster_51/score:2.4987146751049374 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.4216764915975384 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:160.0 - frontier/cluster_57/score:2.485506290741183 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:240.0 - frontier/cluster_63/score:1.3098138695310415 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:291.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0644160504460841 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.07067200143084933 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - 
cluster/prob_snapshot/cluster_17:0.06504899900539277 - cluster/prob_snapshot/cluster_18:0.11516108892253188 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.0877943353117871 - cluster/prob_snapshot/cluster_27:0.0630461428558013 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.07580943882261858 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.06791088069654452 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.07953368009475745 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.054679529500604174 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0733721732915466 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.07111002667405276 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.072984322739383 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - 
cluster/prob_snapshot/cluster_63:0.03846133020804657
(RewardLoopWorker pid=2826756) WARNING:2026-04-12 21:15:06,762:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  36%|███▋      | 292/800 [9:44:49<18:43:23, 132.68s/it]
(RewardLoopWorker pid=2826762) WARNING:2026-04-12 21:15:06,912:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m step:292 - global_seqlen/min:397825 - global_seqlen/max:616139 - global_seqlen/minmax_diff:218314 - global_seqlen/balanced_min:506320 - global_seqlen/balanced_max:506474 - global_seqlen/mean:506391.0 - frontier/skipped_zero_acc_count:25.0 - actor/entropy:np.float64(0.1858010636821676) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.003307079430669546 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.013494869548594579) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0012587705202279792) - actor/ppo_kl:np.float64(-0.00023101829261744472) - actor/pg_clipfrac_lower:np.float64(6.359812288372687e-05) - actor/grad_norm:np.float64(0.5231796044569749) - perf/mfu/actor:np.float64(0.2310098399732137) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(113.94625854492188) - actor/lr:np.float64(1e-06) - training/global_step:292 - training/epoch:0 - critic/score/mean:0.6007281541824341 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6333911418914795 - critic/rewards/max:1.6787996292114258 - critic/rewards/min:-0.07440564036369324 - critic/advantages/mean:-0.021012363955378532 - critic/advantages/max:2.4746358394622803 - critic/advantages/min:-2.47481107711792 - critic/returns/mean:-0.021012363955378532 - critic/returns/max:2.4746358394622803 - critic/returns/min:-2.47481107711792 - response_length/mean:1654.8192138671875 - response_length/max:8192.0 - response_length/min:136.0 - response_length/clip_ratio:0.05946601927280426 - response_length_non_aborted/mean:1654.8192138671875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:136.0 - response_length_non_aborted/clip_ratio:0.05946601927280426 - response/aborted_ratio:0.0 - prompt_length/mean:250.0970916748047 - prompt_length/max:817.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.669495582580566e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0692186197265983) - timing_s/agent_loop/generate_sequences/max:np.float64(36.3927387567237) - timing_s/agent_loop/generate_sequences/mean:np.float64(9.196245877918955) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(36.3927387567237) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:189 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:38.01397512387484 - timing_s/reward:0.0001560170203447342 - timing_s/old_log_prob:11.72750000283122 - timing_s/ref:21.055411783978343 - timing_s/adv:0.10363826248794794 - timing_s/update_actor:26.449345925822854 - timing_s/update_weights:29.21643902733922 - timing_s/step:126.9791208030656 - timing_s/stop_profile:0.00011394359171390533 - timing_per_token_ms/adv:6.602630934389106e-05 - timing_per_token_ms/update_actor:0.016850462890045527 - timing_per_token_ms/gen:0.0278782513883581 - timing_per_token_ms/ref:0.013414072162524245 - perf/total_num_tokens:2025564 - perf/time_per_step:126.9791208030656 - perf/throughput:3987.9863460810357 - frontier/active_count:11.0 - frontier/completed_count:53.0 - frontier/blacklisted_count:2169.0 - frontier/mean_score:2.5601766985941974 - frontier/mean_frontier_pct:0.8379601878346495 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:16.0 - 
frontier/replay_pool_size:5228.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:192.0 - frontier/cluster_2/score:2.4355975751958843 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:192.0 - frontier/cluster_5/score:1.9847315735739826 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:192.0 - frontier/cluster_18/score:4.245295429970092 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:160.0 - frontier/cluster_26/score:2.392906464881072 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:192.0 - frontier/cluster_27/score:2.4029407022688267 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - 
frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:192.0 - frontier/cluster_43/score:2.7959828405210643 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:240.0 - frontier/cluster_48/score:2.203491169242959 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:208.0 - frontier/cluster_51/score:2.649100272573456 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.5951735441182766 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:160.0 - frontier/cluster_57/score:2.639854403518828 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:240.0 - frontier/cluster_63/score:1.816869708671729 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:292.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.08648542169102058 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0704756601961322 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - 
cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.1507458248452219 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.08496950678141672 - cluster/prob_snapshot/cluster_27:0.08532581164091635 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.09928231061893462 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.07824357558292058 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.09406667033526124 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.09215179084188342 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.09373835957808488 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.06451506788820756
(RewardLoopWorker pid=2826757) WARNING:2026-04-12 21:17:15,982:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  37%|███▋      | 293/800 [9:47:00<18:35:08, 131.97s/it]
[36m(TaskRunner pid=2823680)[0m step:293 - global_seqlen/min:405766 - global_seqlen/max:564688 - global_seqlen/minmax_diff:158922 - global_seqlen/balanced_min:475009 - global_seqlen/balanced_max:475269 - global_seqlen/mean:475124.5 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.18587377610407313) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.014830970205366611 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.06331005421816371) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0013587235957764241) - actor/ppo_kl:np.float64(-0.0009005849048276337) - actor/pg_clipfrac_lower:np.float64(4.949408237883531e-05) - actor/grad_norm:np.float64(0.44688100998218244) - perf/mfu/actor:np.float64(0.24986337503812897) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(114.07296752929688) - actor/lr:np.float64(1e-06) - training/global_step:293 - training/epoch:0 - critic/score/mean:0.6327319741249084 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6472248435020447 - critic/rewards/max:1.443894386291504 - critic/rewards/min:-0.10314486920833588 - critic/advantages/mean:-0.08914123475551605 - critic/advantages/max:2.4747822284698486 - critic/advantages/min:-2.474738359451294 - critic/returns/mean:-0.08914123475551605 - critic/returns/max:2.4747822284698486 - critic/returns/min:-2.474738359451294 - response_length/mean:1417.777099609375 - response_length/max:8192.0 - response_length/min:199.0 - response_length/clip_ratio:0.04639175161719322 - response_length_non_aborted/mean:1417.777099609375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:199.0 - response_length_non_aborted/clip_ratio:0.04639175161719322 - response/aborted_ratio:0.0 - prompt_length/mean:264.55670166015625 - prompt_length/max:886.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.431689977645874e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5639103073626757) - timing_s/agent_loop/generate_sequences/max:np.float64(35.38363036699593) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.358812385653437) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(35.38363036699593) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:189 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:37.63090204074979 - timing_s/reward:0.00024149753153324127 - timing_s/old_log_prob:11.516419734805822 - timing_s/ref:25.42182646691799 - timing_s/adv:0.11379730235785246 - timing_s/update_actor:23.506870112381876 - timing_s/update_weights:31.513183281756938 - timing_s/step:130.09831158723682 - timing_s/stop_profile:6.676279008388519e-05 - timing_per_token_ms/adv:8.71682013570775e-05 - timing_per_token_ms/update_actor:0.01800615256051698 - timing_per_token_ms/gen:0.0342038475368001 - timing_per_token_ms/ref:0.019473000171520136 - perf/total_num_tokens:1900498 - perf/time_per_step:130.09831158723682 - perf/throughput:3652.0420150219047 - frontier/active_count:9.0 - frontier/completed_count:55.0 - frontier/blacklisted_count:2198.0 - frontier/mean_score:2.628230231748531 - frontier/mean_frontier_pct:0.8517168833786558 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:40.0 - 
frontier/replay_pool_size:5777.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:208.0 - frontier/cluster_5/score:1.6893121015017878 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:208.0 - frontier/cluster_18/score:4.471706800979064 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:160.0 - frontier/cluster_26/score:2.57503452541675 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:192.0 - frontier/cluster_27/score:2.5820584915881786 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:192.0 - frontier/cluster_43/score:2.857187988364745 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:256.0 - frontier/cluster_48/score:2.442443818470071 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:176.0 - 
frontier/cluster_53/score:2.716621480882793 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:176.0 - frontier/cluster_57/score:2.7478980824631796 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:256.0 - frontier/cluster_63/score:1.5718087960702103 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:293.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.07141739043403143 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - 
cluster/prob_snapshot/cluster_18:0.18904596150594585 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.10886220842158827 - cluster/prob_snapshot/cluster_27:0.10915915374863255 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.12079053357107197 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.1032568011806663 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.11484794123549216 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.11617019143693828 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.06644981846563319
[36m(TaskRunner pid=2823680)[0m Training Progress:  37%|███▋      | 294/800 [9:49:03<18:11:47, 129.46s/it]
[36m(TaskRunner pid=2823680)[0m step:294 - global_seqlen/min:420130 - global_seqlen/max:463161 - global_seqlen/minmax_diff:43031 - global_seqlen/balanced_min:437606 - global_seqlen/balanced_max:437799 - global_seqlen/mean:437684.0 - frontier/skipped_zero_acc_count:23.0 - actor/entropy:np.float64(0.17751550788657283) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.015304060652852058 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.07111051790798228) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0017714647884823312) - actor/ppo_kl:np.float64(-0.0007822756045410608) - actor/pg_clipfrac_lower:np.float64(0.00016927221138689246) - actor/grad_norm:np.float64(0.7366638428398541) - perf/mfu/actor:np.float64(0.17287086391237427) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(114.21381759643555) - actor/lr:np.float64(1e-06) - training/global_step:294 - training/epoch:0 - critic/score/mean:0.6452381014823914 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6607300043106079 - critic/rewards/max:1.8842270374298096 - critic/rewards/min:-0.08798715472221375 - critic/advantages/mean:-0.04509960487484932 - critic/advantages/max:2.474379301071167 - critic/advantages/min:-2.474612236022949 - critic/returns/mean:-0.04509960487484932 - critic/returns/max:2.474379301071167 - critic/returns/min:-2.474612236022949 - response_length/mean:1379.103515625 - response_length/max:8192.0 - response_length/min:151.0 - response_length/clip_ratio:0.04047619178891182 - response_length_non_aborted/mean:1379.103515625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:151.0 - response_length_non_aborted/clip_ratio:0.04047619178891182 - response/aborted_ratio:0.0 - prompt_length/mean:237.2857208251953 - prompt_length/max:962.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.724629878997803e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5360984895378351) - timing_s/agent_loop/generate_sequences/max:np.float64(32.88354311697185) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.890870831702159) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(32.88354311697185) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:192 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:35.11332172062248 - timing_s/reward:0.00020172540098428726 - timing_s/old_log_prob:12.462468892335892 - timing_s/ref:18.68966485466808 - timing_s/adv:0.1401371480897069 - timing_s/update_actor:30.261317249387503 - timing_s/update_weights:26.224147996865213 - timing_s/step:123.38394204899669 - timing_s/stop_profile:5.8414414525032043e-05 - timing_per_token_ms/adv:0.00010321148480535092 - timing_per_token_ms/update_actor:0.022287562777256704 - timing_per_token_ms/gen:0.030310684667164297 - timing_per_token_ms/ref:0.013765001546412662 - perf/total_num_tokens:1750736 - perf/time_per_step:123.38394204899669 - perf/throughput:3547.3335730041144 - frontier/active_count:8.0 - frontier/completed_count:56.0 - frontier/blacklisted_count:2220.0 - frontier/mean_score:2.5784569624163 - frontier/mean_frontier_pct:0.8661598441440528 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:0.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:56.0 - frontier/replay_pool_size:6076.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:208.0 - frontier/cluster_5/score:2.082518471051251 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - 
frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:176.0 - frontier/cluster_26/score:2.702524167791725 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:208.0 - frontier/cluster_27/score:2.707440944111725 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - 
frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:208.0 - frontier/cluster_43/score:2.900031591855321 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:256.0 - frontier/cluster_48/score:2.6097106729290496 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:176.0 - frontier/cluster_53/score:2.801635036617955 - 
frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:176.0 - frontier/cluster_57/score:2.823528657724226 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:256.0 - frontier/cluster_63/score:2.000266157249147 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:294.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.10095759311703328 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - 
cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.13101460520690444 - cluster/prob_snapshot/cluster_27:0.13125296367049658 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.14058948986381714 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.12651513632805902 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.1358193619989934 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.13688073423757413 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.09697011557712193
[36m(RewardLoopWorker pid=2826758)[0m WARNING:2026-04-12 21:21:31,990:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  37%|███▋      | 295/800 [9:51:21<18:31:14, 132.03s/it]
[36m(TaskRunner pid=2823680)[0m step:295 - global_seqlen/min:444250 - global_seqlen/max:554444 - global_seqlen/minmax_diff:110194 - global_seqlen/balanced_min:490338 - global_seqlen/balanced_max:490611 - global_seqlen/mean:490497.5 - frontier/skipped_zero_acc_count:25.0 - actor/entropy:np.float64(0.16673137731133744) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013347630389034748 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.009669928921084647) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.004459381811242425) - actor/ppo_kl:np.float64(0.006467773972900659) - actor/pg_clipfrac_lower:np.float64(0.0004938175229304201) - actor/grad_norm:np.float64(0.44723227620124817) - perf/mfu/actor:np.float64(0.24235653441526372) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(113.95421981811523) - actor/lr:np.float64(1e-06) - training/global_step:295 - training/epoch:0 - critic/score/mean:0.5473300814628601 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5741045475006104 - critic/rewards/max:1.3687301874160767 - critic/rewards/min:-0.11061841249465942 - critic/advantages/mean:0.0167570561170578 - critic/advantages/max:2.474562644958496 - critic/advantages/min:-2.4747140407562256 - critic/returns/mean:0.0167570561170578 - critic/returns/max:2.474562644958496 - critic/returns/min:-2.4747140407562256 - response_length/mean:1558.40771484375 - response_length/max:8192.0 - response_length/min:157.0 - response_length/clip_ratio:0.05461164936423302 - response_length_non_aborted/mean:1558.40771484375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:157.0 - response_length_non_aborted/clip_ratio:0.05461164936423302 - response/aborted_ratio:0.0 - prompt_length/mean:239.30096435546875 - prompt_length/max:401.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.212549775838852e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.284375861287117) - timing_s/agent_loop/generate_sequences/max:np.float64(35.68335558939725) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.949826240150287) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(35.68335558939725) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:189 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:37.6490021282807 - timing_s/reward:0.00026027020066976547 - timing_s/old_log_prob:12.404737642034888 - timing_s/ref:27.771143464371562 - timing_s/adv:0.08535679709166288 - timing_s/update_actor:24.437920166179538 - timing_s/update_weights:35.016342306509614 - timing_s/step:137.78053693939 - timing_s/stop_profile:6.423797458410263e-05 - timing_per_token_ms/adv:5.762243004286935e-05 - timing_per_token_ms/update_actor:0.016497483424274925 - timing_per_token_ms/gen:0.02931873000844207 - timing_per_token_ms/ref:0.018747666571506585 - perf/total_num_tokens:1961990 - perf/time_per_step:137.78053693939 - perf/throughput:3559.991207000239 - frontier/active_count:6.0 - frontier/completed_count:58.0 - frontier/blacklisted_count:2244.0 - frontier/mean_score:2.6348584275263587 - frontier/mean_frontier_pct:0.8672193184556093 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:6.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:64.0 - 
frontier/replay_pool_size:6251.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:224.0 - frontier/cluster_5/score:1.757762929735876 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:249.0 - frontier/cluster_12/score:0.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:298.0 - frontier/cluster_25/score:0.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:176.0 - frontier/cluster_26/score:2.7917669174542072 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:208.0 - frontier/cluster_27/score:2.795208660878207 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:226.0 - frontier/cluster_30/score:0.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:246.0 - frontier/cluster_39/score:0.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:209.0 - frontier/cluster_43/score:0.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:272.0 - frontier/cluster_48/score:2.726797471050334 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:192.0 - 
frontier/cluster_53/score:2.861144525632568 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:192.0 - frontier/cluster_57/score:2.876470060406958 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:295.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.1111864247538901 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.0 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0 - 
cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.0 - cluster/prob_snapshot/cluster_26:0.17659183559216604 - cluster/prob_snapshot/cluster_27:0.17680954137020988 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.0 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.1724822253929274 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.18098028188423074 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.18194969100657574 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  37%|███▋      | 296/800 [9:53:09<17:27:06, 124.65s/it]
[36m(TaskRunner pid=2823680)[0m step:296 - global_seqlen/min:303986 - global_seqlen/max:472143 - global_seqlen/minmax_diff:168157 - global_seqlen/balanced_min:388828 - global_seqlen/balanced_max:388909 - global_seqlen/mean:388866.5 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.157553705580843) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009949782863259315 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.003601984979468398) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0008994607607443564) - actor/ppo_kl:np.float64(-0.00017022686680808383) - actor/pg_clipfrac_lower:np.float64(6.198897858666896e-05) - actor/grad_norm:np.float64(0.2948544919490814) - perf/mfu/actor:np.float64(0.23020686380475108) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(114.0242805480957) - actor/lr:np.float64(1e-06) - training/global_step:296 - training/epoch:0 - critic/score/mean:0.671875 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6772525906562805 - critic/rewards/max:1.1582190990447998 - critic/rewards/min:-0.05939880385994911 - critic/advantages/mean:-0.04704226553440094 - critic/advantages/max:2.4748048782348633 - critic/advantages/min:-2.4748029708862305 - critic/returns/mean:-0.04704226553440094 - critic/returns/max:2.4748048782348633 - critic/returns/min:-2.4748029708862305 - response_length/mean:1073.0806884765625 - response_length/max:8192.0 - response_length/min:169.0 - response_length/clip_ratio:0.015625 - response_length_non_aborted/mean:1073.0806884765625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:169.0 - response_length_non_aborted/clip_ratio:0.015625 - response/aborted_ratio:0.0 - prompt_length/mean:237.7708282470703 - prompt_length/max:382.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010262429714202881 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3166192257776856) - timing_s/agent_loop/generate_sequences/max:np.float64(29.769153871573508) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.764904892487721) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(29.769153871573508) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:207 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:31.954405361786485 - timing_s/reward:0.00016904529184103012 - timing_s/old_log_prob:9.604409845545888 - timing_s/ref:18.649242990650237 - timing_s/adv:0.08931294456124306 - timing_s/update_actor:20.00424148980528 - timing_s/update_weights:26.49582076817751 - timing_s/step:107.2365956697613 - timing_s/stop_profile:5.792267620563507e-05 - timing_per_token_ms/adv:8.871553415424835e-05 - timing_per_token_ms/update_actor:0.01987043398733457 - timing_per_token_ms/gen:0.03877368917105695 - timing_per_token_ms/ref:0.018524499014288022 - perf/total_num_tokens:1555466 - perf/time_per_step:107.2365956697613 - perf/throughput:3626.248087896481 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:32.0 - frontier/mean_score:2.09375 - frontier/mean_frontier_pct:0.006010469059359773 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.3 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.3 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.3 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.0 - 
frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.3 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.3 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.3 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.0 - frontier/cluster_36/cluster_size:108.0 - 
frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:2.9 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.9 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.3 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:0.0 - frontier/cluster_50/score:2.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - 
frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.0 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.3 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:296.0 - cluster/prob_snapshot/cluster_0:0.014925373134328358 - cluster/prob_snapshot/cluster_1:0.014925373134328358 - cluster/prob_snapshot/cluster_2:0.014925373134328358 - cluster/prob_snapshot/cluster_3:0.017164179104477612 - cluster/prob_snapshot/cluster_4:0.017164179104477612 - cluster/prob_snapshot/cluster_5:0.014925373134328358 - cluster/prob_snapshot/cluster_6:0.017164179104477612 - cluster/prob_snapshot/cluster_7:0.014925373134328358 - cluster/prob_snapshot/cluster_8:0.017164179104477612 - cluster/prob_snapshot/cluster_9:0.014925373134328358 - cluster/prob_snapshot/cluster_10:0.014925373134328358 - cluster/prob_snapshot/cluster_11:0.014925373134328358 - cluster/prob_snapshot/cluster_12:0.014925373134328358 - cluster/prob_snapshot/cluster_13:0.014925373134328358 - cluster/prob_snapshot/cluster_14:0.014925373134328358 - cluster/prob_snapshot/cluster_15:0.014925373134328358 - cluster/prob_snapshot/cluster_16:0.014925373134328358 - cluster/prob_snapshot/cluster_17:0.014925373134328358 - 
cluster/prob_snapshot/cluster_18:0.014925373134328358 - cluster/prob_snapshot/cluster_19:0.014925373134328358 - cluster/prob_snapshot/cluster_20:0.017164179104477612 - cluster/prob_snapshot/cluster_21:0.014925373134328358 - cluster/prob_snapshot/cluster_22:0.014925373134328358 - cluster/prob_snapshot/cluster_23:0.017164179104477612 - cluster/prob_snapshot/cluster_24:0.014925373134328358 - cluster/prob_snapshot/cluster_25:0.012686567164179104 - cluster/prob_snapshot/cluster_26:0.014925373134328358 - cluster/prob_snapshot/cluster_27:0.017164179104477612 - cluster/prob_snapshot/cluster_28:0.014925373134328358 - cluster/prob_snapshot/cluster_29:0.014925373134328358 - cluster/prob_snapshot/cluster_30:0.014925373134328358 - cluster/prob_snapshot/cluster_31:0.014925373134328358 - cluster/prob_snapshot/cluster_32:0.014925373134328358 - cluster/prob_snapshot/cluster_33:0.014925373134328358 - cluster/prob_snapshot/cluster_34:0.014925373134328358 - cluster/prob_snapshot/cluster_35:0.014925373134328358 - cluster/prob_snapshot/cluster_36:0.014925373134328358 - cluster/prob_snapshot/cluster_37:0.014925373134328358 - cluster/prob_snapshot/cluster_38:0.021641791044776117 - cluster/prob_snapshot/cluster_39:0.014925373134328358 - cluster/prob_snapshot/cluster_40:0.021641791044776117 - cluster/prob_snapshot/cluster_41:0.014925373134328358 - cluster/prob_snapshot/cluster_42:0.014925373134328358 - cluster/prob_snapshot/cluster_43:0.021641791044776117 - cluster/prob_snapshot/cluster_44:0.017164179104477612 - cluster/prob_snapshot/cluster_45:0.014925373134328358 - cluster/prob_snapshot/cluster_46:0.017164179104477612 - cluster/prob_snapshot/cluster_47:0.014925373134328358 - cluster/prob_snapshot/cluster_48:0.017164179104477612 - cluster/prob_snapshot/cluster_49:0.014925373134328358 - cluster/prob_snapshot/cluster_50:0.014925373134328358 - cluster/prob_snapshot/cluster_51:0.014925373134328358 - cluster/prob_snapshot/cluster_52:0.014925373134328358 - 
cluster/prob_snapshot/cluster_53:0.014925373134328358 - cluster/prob_snapshot/cluster_54:0.014925373134328358 - cluster/prob_snapshot/cluster_55:0.014925373134328358 - cluster/prob_snapshot/cluster_56:0.014925373134328358 - cluster/prob_snapshot/cluster_57:0.014925373134328358 - cluster/prob_snapshot/cluster_58:0.017164179104477612 - cluster/prob_snapshot/cluster_59:0.014925373134328358 - cluster/prob_snapshot/cluster_60:0.014925373134328358 - cluster/prob_snapshot/cluster_61:0.014925373134328358 - cluster/prob_snapshot/cluster_62:0.014925373134328358 - cluster/prob_snapshot/cluster_63:0.017164179104477612
[36m(TaskRunner pid=2823680)[0m Training Progress:  37%|███▋      | 297/800 [9:55:16<17:31:25, 125.42s/it]
[36m(TaskRunner pid=2823680)[0m step:297 - global_seqlen/min:334285 - global_seqlen/max:496194 - global_seqlen/minmax_diff:161909 - global_seqlen/balanced_min:405532 - global_seqlen/balanced_max:405668 - global_seqlen/mean:405589.0 - frontier/skipped_zero_acc_count:35.0 - actor/entropy:np.float64(0.16467634855987542) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008369647897779942 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.10914537205826491) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0016608420081227334) - actor/ppo_kl:np.float64(-0.0013857098345182262) - actor/pg_clipfrac_lower:np.float64(0.0002926528576357127) - actor/grad_norm:np.float64(0.329665915419658) - perf/mfu/actor:np.float64(0.22266244370566896) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(105.26473999023438) - actor/lr:np.float64(1e-06) - training/global_step:297 - training/epoch:0 - critic/score/mean:0.5967742204666138 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6033810973167419 - critic/rewards/max:1.1325515508651733 - critic/rewards/min:-0.0709695890545845 - critic/advantages/mean:-0.061608340591192245 - critic/advantages/max:2.4747965335845947 - critic/advantages/min:-2.474766492843628 - critic/returns/mean:-0.061608340591192245 - critic/returns/max:2.4747965335845947 - critic/returns/min:-2.474766492843628 - response_length/mean:1211.6209716796875 - response_length/max:8192.0 - response_length/min:186.0 - response_length/clip_ratio:0.022849462926387787 - response_length_non_aborted/mean:1211.6209716796875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:186.0 - response_length_non_aborted/clip_ratio:0.022849462926387787 - response/aborted_ratio:0.0 - prompt_length/mean:239.19354248046875 - prompt_length/max:382.0 - prompt_length/min:181.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.122956544160843e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4995153434574604) - timing_s/agent_loop/generate_sequences/max:np.float64(30.43036311212927) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.137409036036843) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(30.43036311212927) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:204 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:32.962059089913964 - timing_s/reward:0.0002733040601015091 - timing_s/old_log_prob:10.010607905685902 - timing_s/ref:21.768455609679222 - timing_s/adv:0.0630170376971364 - timing_s/update_actor:21.601138290949166 - timing_s/update_weights:40.1398467188701 - timing_s/step:126.95811547990888 - timing_s/stop_profile:8.873920887708664e-05 - timing_per_token_ms/adv:5.8381218649087e-05 - timing_per_token_ms/update_actor:0.02001206060643462 - timing_per_token_ms/gen:0.03656576111038705 - timing_per_token_ms/ref:0.02016706930448712 - perf/total_num_tokens:1622356 - perf/time_per_step:126.95811547990888 - perf/throughput:3194.667772649669 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:67.0 - frontier/mean_score:2.1401562499999995 - frontier/mean_frontier_pct:0.019995082839975236 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.3 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.3 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:1.91 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - 
frontier/cluster_18/score:2.3 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.51 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.3 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.9 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:16.0 - frontier/cluster_25/score:1.7 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.3 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.3 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:2.9 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.9299999999999997 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:2.51 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - 
frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.3 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:2.51 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:297.0 - cluster/prob_snapshot/cluster_0:0.01460173760677521 - cluster/prob_snapshot/cluster_1:0.01460173760677521 - cluster/prob_snapshot/cluster_2:0.01460173760677521 - cluster/prob_snapshot/cluster_3:0.01679199824779149 - cluster/prob_snapshot/cluster_4:0.01679199824779149 - cluster/prob_snapshot/cluster_5:0.01679199824779149 - cluster/prob_snapshot/cluster_6:0.01679199824779149 - cluster/prob_snapshot/cluster_7:0.01460173760677521 - cluster/prob_snapshot/cluster_8:0.013944659414470324 - cluster/prob_snapshot/cluster_9:0.01460173760677521 - cluster/prob_snapshot/cluster_10:0.01460173760677521 - cluster/prob_snapshot/cluster_11:0.01679199824779149 - cluster/prob_snapshot/cluster_12:0.01460173760677521 - cluster/prob_snapshot/cluster_13:0.01460173760677521 - cluster/prob_snapshot/cluster_14:0.01460173760677521 - cluster/prob_snapshot/cluster_15:0.01460173760677521 - cluster/prob_snapshot/cluster_16:0.01460173760677521 - 
cluster/prob_snapshot/cluster_17:0.01460173760677521 - cluster/prob_snapshot/cluster_18:0.01679199824779149 - cluster/prob_snapshot/cluster_19:0.01460173760677521 - cluster/prob_snapshot/cluster_20:0.018325180696502887 - cluster/prob_snapshot/cluster_21:0.01460173760677521 - cluster/prob_snapshot/cluster_22:0.01460173760677521 - cluster/prob_snapshot/cluster_23:0.01679199824779149 - cluster/prob_snapshot/cluster_24:0.021172519529824053 - cluster/prob_snapshot/cluster_25:0.012411476965758927 - cluster/prob_snapshot/cluster_26:0.01460173760677521 - cluster/prob_snapshot/cluster_27:0.01679199824779149 - cluster/prob_snapshot/cluster_28:0.01460173760677521 - cluster/prob_snapshot/cluster_29:0.012411476965758927 - cluster/prob_snapshot/cluster_30:0.01460173760677521 - cluster/prob_snapshot/cluster_31:0.01460173760677521 - cluster/prob_snapshot/cluster_32:0.01679199824779149 - cluster/prob_snapshot/cluster_33:0.01460173760677521 - cluster/prob_snapshot/cluster_34:0.01460173760677521 - cluster/prob_snapshot/cluster_35:0.01460173760677521 - cluster/prob_snapshot/cluster_36:0.01679199824779149 - cluster/prob_snapshot/cluster_37:0.01460173760677521 - cluster/prob_snapshot/cluster_38:0.021172519529824053 - cluster/prob_snapshot/cluster_39:0.01460173760677521 - cluster/prob_snapshot/cluster_40:0.021172519529824053 - cluster/prob_snapshot/cluster_41:0.01460173760677521 - cluster/prob_snapshot/cluster_42:0.01460173760677521 - cluster/prob_snapshot/cluster_43:0.02139154559392568 - cluster/prob_snapshot/cluster_44:0.018325180696502887 - cluster/prob_snapshot/cluster_45:0.01460173760677521 - cluster/prob_snapshot/cluster_46:0.01679199824779149 - cluster/prob_snapshot/cluster_47:0.01460173760677521 - cluster/prob_snapshot/cluster_48:0.01679199824779149 - cluster/prob_snapshot/cluster_49:0.01460173760677521 - cluster/prob_snapshot/cluster_50:0.012411476965758927 - cluster/prob_snapshot/cluster_51:0.01460173760677521 - cluster/prob_snapshot/cluster_52:0.01460173760677521 - 
cluster/prob_snapshot/cluster_53:0.01460173760677521 - cluster/prob_snapshot/cluster_54:0.01460173760677521 - cluster/prob_snapshot/cluster_55:0.01679199824779149 - cluster/prob_snapshot/cluster_56:0.01679199824779149 - cluster/prob_snapshot/cluster_57:0.01679199824779149 - cluster/prob_snapshot/cluster_58:0.01679199824779149 - cluster/prob_snapshot/cluster_59:0.01460173760677521 - cluster/prob_snapshot/cluster_60:0.01460173760677521 - cluster/prob_snapshot/cluster_61:0.01460173760677521 - cluster/prob_snapshot/cluster_62:0.01460173760677521 - cluster/prob_snapshot/cluster_63:0.018325180696502887
(RewardLoopWorker pid=2826758) WARNING:2026-04-12 21:27:58,620:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  37%|███▋      | 298/800 [9:57:21<17:29:14, 125.41s/it]
[36m(TaskRunner pid=2823680)[0m step:298 - global_seqlen/min:366695 - global_seqlen/max:560291 - global_seqlen/minmax_diff:193596 - global_seqlen/balanced_min:460235 - global_seqlen/balanced_max:460614 - global_seqlen/mean:460463.0 - frontier/skipped_zero_acc_count:25.0 - actor/entropy:np.float64(0.15509145312870926) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.006130635738372803 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.08437670476268977) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.002290321149792782) - actor/ppo_kl:np.float64(0.0005884827664093433) - actor/pg_clipfrac_lower:np.float64(0.0003425367717332287) - actor/grad_norm:np.float64(0.29345450836878556) - perf/mfu/actor:np.float64(0.19594384879370888) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(113.16441345214844) - actor/lr:np.float64(1e-06) - training/global_step:298 - training/epoch:0 - critic/score/mean:0.5655339956283569 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5944932103157043 - critic/rewards/max:1.655890703201294 - critic/rewards/min:-0.08332731574773788 - critic/advantages/mean:0.009395345114171505 - critic/advantages/max:2.474717855453491 - critic/advantages/min:-2.4748520851135254 - critic/returns/mean:0.009395345114171505 - critic/returns/max:2.474717855453491 - critic/returns/min:-2.4748520851135254 - response_length/mean:1459.666259765625 - response_length/max:8192.0 - response_length/min:87.0 - response_length/clip_ratio:0.05461164936423302 - response_length_non_aborted/mean:1459.666259765625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:87.0 - response_length_non_aborted/clip_ratio:0.05461164936423302 - response/aborted_ratio:0.0 - prompt_length/mean:240.13592529296875 - prompt_length/max:377.0 - prompt_length/min:179.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.447429329156876e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.7339721582829952) - timing_s/agent_loop/generate_sequences/max:np.float64(36.827799937687814) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.291375530968253) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(36.827799937687814) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:193 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:39.11077919322997 - timing_s/reward:0.00014532450586557388 - timing_s/old_log_prob:11.55723191704601 - timing_s/ref:18.24034873675555 - timing_s/adv:0.07274115644395351 - timing_s/update_actor:28.174163836054504 - timing_s/update_weights:27.54136395920068 - timing_s/step:125.15682689659297 - timing_s/stop_profile:5.2102841436862946e-05 - timing_per_token_ms/adv:5.1934338764400424e-05 - timing_per_token_ms/update_actor:0.020115250301151907 - timing_per_token_ms/gen:0.032517390507064946 - timing_per_token_ms/ref:0.013022895108979379 - perf/total_num_tokens:1841852 - perf/time_per_step:125.15682689659297 - perf/throughput:3679.088160172386 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:92.0 - frontier/mean_score:2.19959375 - frontier/mean_frontier_pct:0.02804938725964098 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.3 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.3 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.3 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.3 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.237 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.0 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - 
frontier/cluster_18/score:2.3 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.6569999999999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.51 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.9 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.3 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:3.11 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:1.7 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.3 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.3 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:2.9 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.0 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.9299999999999997 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:2.51 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.51 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.3 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.3 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.0 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - 
frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:2.51 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:16.0 - frontier/cluster_63/score:2.51 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:298.0 - cluster/prob_snapshot/cluster_0:0.01633824427806271 - cluster/prob_snapshot/cluster_1:0.01633824427806271 - cluster/prob_snapshot/cluster_2:0.014207168937445835 - cluster/prob_snapshot/cluster_3:0.01633824427806271 - cluster/prob_snapshot/cluster_4:0.01633824427806271 - cluster/prob_snapshot/cluster_5:0.01633824427806271 - cluster/prob_snapshot/cluster_6:0.01633824427806271 - cluster/prob_snapshot/cluster_7:0.014207168937445835 - cluster/prob_snapshot/cluster_8:0.015890718456533167 - cluster/prob_snapshot/cluster_9:0.014207168937445835 - cluster/prob_snapshot/cluster_10:0.014207168937445835 - cluster/prob_snapshot/cluster_11:0.01633824427806271 - cluster/prob_snapshot/cluster_12:0.014207168937445835 - cluster/prob_snapshot/cluster_13:0.014207168937445835 - cluster/prob_snapshot/cluster_14:0.014207168937445835 - cluster/prob_snapshot/cluster_15:0.01633824427806271 - cluster/prob_snapshot/cluster_16:0.014207168937445835 - 
cluster/prob_snapshot/cluster_17:0.014207168937445835 - cluster/prob_snapshot/cluster_18:0.01633824427806271 - cluster/prob_snapshot/cluster_19:0.014207168937445835 - cluster/prob_snapshot/cluster_20:0.01887422393339679 - cluster/prob_snapshot/cluster_21:0.014207168937445835 - cluster/prob_snapshot/cluster_22:0.014207168937445835 - cluster/prob_snapshot/cluster_23:0.01782999701649452 - cluster/prob_snapshot/cluster_24:0.02060039495929646 - cluster/prob_snapshot/cluster_25:0.010584340858397148 - cluster/prob_snapshot/cluster_26:0.01633824427806271 - cluster/prob_snapshot/cluster_27:0.022092147697728274 - cluster/prob_snapshot/cluster_28:0.014207168937445835 - cluster/prob_snapshot/cluster_29:0.012076093596828959 - cluster/prob_snapshot/cluster_30:0.012076093596828959 - cluster/prob_snapshot/cluster_31:0.014207168937445835 - cluster/prob_snapshot/cluster_32:0.01633824427806271 - cluster/prob_snapshot/cluster_33:0.014207168937445835 - cluster/prob_snapshot/cluster_34:0.014207168937445835 - cluster/prob_snapshot/cluster_35:0.01633824427806271 - cluster/prob_snapshot/cluster_36:0.01633824427806271 - cluster/prob_snapshot/cluster_37:0.014207168937445835 - cluster/prob_snapshot/cluster_38:0.02060039495929646 - cluster/prob_snapshot/cluster_39:0.014207168937445835 - cluster/prob_snapshot/cluster_40:0.02060039495929646 - cluster/prob_snapshot/cluster_41:0.01633824427806271 - cluster/prob_snapshot/cluster_42:0.014207168937445835 - cluster/prob_snapshot/cluster_43:0.020813502493358147 - cluster/prob_snapshot/cluster_44:0.01782999701649452 - cluster/prob_snapshot/cluster_45:0.014207168937445835 - cluster/prob_snapshot/cluster_46:0.01782999701649452 - cluster/prob_snapshot/cluster_47:0.01633824427806271 - cluster/prob_snapshot/cluster_48:0.01633824427806271 - cluster/prob_snapshot/cluster_49:0.014207168937445835 - cluster/prob_snapshot/cluster_50:0.012076093596828959 - cluster/prob_snapshot/cluster_51:0.014207168937445835 - cluster/prob_snapshot/cluster_52:0.01633824427806271 - 
cluster/prob_snapshot/cluster_53:0.014207168937445835 - cluster/prob_snapshot/cluster_54:0.014207168937445835 - cluster/prob_snapshot/cluster_55:0.01782999701649452 - cluster/prob_snapshot/cluster_56:0.01633824427806271 - cluster/prob_snapshot/cluster_57:0.01633824427806271 - cluster/prob_snapshot/cluster_58:0.01633824427806271 - cluster/prob_snapshot/cluster_59:0.014207168937445835 - cluster/prob_snapshot/cluster_60:0.014207168937445835 - cluster/prob_snapshot/cluster_61:0.014207168937445835 - cluster/prob_snapshot/cluster_62:0.014207168937445835 - cluster/prob_snapshot/cluster_63:0.01782999701649452
(RewardLoopWorker pid=2826757) WARNING:2026-04-12 21:29:44,871:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  37%|███▋      | 299/800 [9:59:21<17:13:11, 123.74s/it]
[36m(TaskRunner pid=2823680)[0m step:299 - global_seqlen/min:400472 - global_seqlen/max:497154 - global_seqlen/minmax_diff:96682 - global_seqlen/balanced_min:454733 - global_seqlen/balanced_max:455082 - global_seqlen/mean:454962.25 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.14495531082567242) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0064998893067240715 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.04607241530902684) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000696419645181676) - actor/ppo_kl:np.float64(-5.29498138440153e-05) - actor/pg_clipfrac_lower:np.float64(4.318361950734268e-06) - actor/grad_norm:np.float64(0.40537921090920764) - perf/mfu/actor:np.float64(0.24984538273497403) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(113.38745021820068) - actor/lr:np.float64(1e-06) - training/global_step:299 - training/epoch:0 - critic/score/mean:0.6347222328186035 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6482223272323608 - critic/rewards/max:1.3848367929458618 - critic/rewards/min:-0.38096684217453003 - critic/advantages/mean:-0.08792375028133392 - critic/advantages/max:2.474679708480835 - critic/advantages/min:-2.474731683731079 - critic/returns/mean:-0.08792375028133392 - critic/returns/max:2.474679708480835 - critic/returns/min:-2.474731683731079 - response_length/mean:1298.6097412109375 - response_length/max:8192.0 - response_length/min:205.0 - response_length/clip_ratio:0.02916666679084301 - response_length_non_aborted/mean:1298.6097412109375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:205.0 - response_length_non_aborted/clip_ratio:0.02916666679084301 - response/aborted_ratio:0.0 - prompt_length/mean:240.65554809570312 - prompt_length/max:401.0 - prompt_length/min:187.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.109079837799072e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6792177725583315) - timing_s/agent_loop/generate_sequences/max:np.float64(32.77437113318592) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.299574542351365) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(32.77437113318592) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:206 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:35.62939813826233 - timing_s/reward:0.00024482980370521545 - timing_s/old_log_prob:10.281972754746675 - timing_s/ref:22.097610915079713 - timing_s/adv:0.07969061192125082 - timing_s/update_actor:21.763159026391804 - timing_s/update_weights:29.31681620143354 - timing_s/step:119.5969461305067 - timing_s/stop_profile:6.481073796749115e-05 - timing_per_token_ms/adv:7.190534798912072e-05 - timing_per_token_ms/update_actor:0.019637037354935574 - timing_per_token_ms/gen:0.03810634892471792 - timing_per_token_ms/ref:0.019938815429691576 - perf/total_num_tokens:1819849 - perf/time_per_step:119.5969461305067 - perf/throughput:3804.1293253720346 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:130.0 - frontier/mean_score:2.23128125 - frontier/mean_frontier_pct:0.037998347243051414 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.3 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.3 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.3 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.51 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.3 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:16.0 - frontier/cluster_8/score:2.237 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.3 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - 
frontier/cluster_18/score:2.51 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.6569999999999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.6569999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.9 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:3.11 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:1.7 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.3 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:2.51 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - 
frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:2.9 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.3 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.9299999999999997 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:2.6569999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.6569999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.3 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.3 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:0.0 - frontier/cluster_53/score:2.3 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - 
frontier/cluster_54/score:2.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:2.51 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.3 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:32.0 - frontier/cluster_63/score:2.0569999999999995 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:299.0 - cluster/prob_snapshot/cluster_0:0.016106216999761908 - cluster/prob_snapshot/cluster_1:0.016106216999761908 - cluster/prob_snapshot/cluster_2:0.011904595173737063 - cluster/prob_snapshot/cluster_3:0.016106216999761908 - cluster/prob_snapshot/cluster_4:0.017576784638870604 - cluster/prob_snapshot/cluster_5:0.016106216999761908 - cluster/prob_snapshot/cluster_6:0.016106216999761908 - cluster/prob_snapshot/cluster_7:0.014005406086749486 - cluster/prob_snapshot/cluster_8:0.0156650467080293 - cluster/prob_snapshot/cluster_9:0.014005406086749486 - cluster/prob_snapshot/cluster_10:0.014005406086749486 - cluster/prob_snapshot/cluster_11:0.016106216999761908 - cluster/prob_snapshot/cluster_12:0.014005406086749486 - cluster/prob_snapshot/cluster_13:0.014005406086749486 - cluster/prob_snapshot/cluster_14:0.014005406086749486 - cluster/prob_snapshot/cluster_15:0.016106216999761908 - 
cluster/prob_snapshot/cluster_16:0.014005406086749486 - cluster/prob_snapshot/cluster_17:0.016106216999761908 - cluster/prob_snapshot/cluster_18:0.017576784638870604 - cluster/prob_snapshot/cluster_19:0.016106216999761908 - cluster/prob_snapshot/cluster_20:0.018606181986246688 - cluster/prob_snapshot/cluster_21:0.014005406086749486 - cluster/prob_snapshot/cluster_22:0.014005406086749486 - cluster/prob_snapshot/cluster_23:0.018606181986246688 - cluster/prob_snapshot/cluster_24:0.020307838825786753 - cluster/prob_snapshot/cluster_25:0.010434027534628368 - cluster/prob_snapshot/cluster_26:0.013375162812845759 - cluster/prob_snapshot/cluster_27:0.02177840646489545 - cluster/prob_snapshot/cluster_28:0.014005406086749486 - cluster/prob_snapshot/cluster_29:0.011904595173737063 - cluster/prob_snapshot/cluster_30:0.011904595173737063 - cluster/prob_snapshot/cluster_31:0.014005406086749486 - cluster/prob_snapshot/cluster_32:0.016106216999761908 - cluster/prob_snapshot/cluster_33:0.014005406086749486 - cluster/prob_snapshot/cluster_34:0.016106216999761908 - cluster/prob_snapshot/cluster_35:0.017576784638870604 - cluster/prob_snapshot/cluster_36:0.016106216999761908 - cluster/prob_snapshot/cluster_37:0.014005406086749486 - cluster/prob_snapshot/cluster_38:0.020307838825786753 - cluster/prob_snapshot/cluster_39:0.016106216999761908 - cluster/prob_snapshot/cluster_40:0.020307838825786753 - cluster/prob_snapshot/cluster_41:0.016106216999761908 - cluster/prob_snapshot/cluster_42:0.016106216999761908 - cluster/prob_snapshot/cluster_43:0.020517919917087995 - cluster/prob_snapshot/cluster_44:0.018606181986246688 - cluster/prob_snapshot/cluster_45:0.014005406086749486 - cluster/prob_snapshot/cluster_46:0.018606181986246688 - cluster/prob_snapshot/cluster_47:0.016106216999761908 - cluster/prob_snapshot/cluster_48:0.016106216999761908 - cluster/prob_snapshot/cluster_49:0.014005406086749486 - cluster/prob_snapshot/cluster_50:0.011904595173737063 - 
cluster/prob_snapshot/cluster_51:0.014005406086749486 - cluster/prob_snapshot/cluster_52:0.016106216999761908 - cluster/prob_snapshot/cluster_53:0.016106216999761908 - cluster/prob_snapshot/cluster_54:0.014005406086749486 - cluster/prob_snapshot/cluster_55:0.017576784638870604 - cluster/prob_snapshot/cluster_56:0.016106216999761908 - cluster/prob_snapshot/cluster_57:0.016106216999761908 - cluster/prob_snapshot/cluster_58:0.016106216999761908 - cluster/prob_snapshot/cluster_59:0.016106216999761908 - cluster/prob_snapshot/cluster_60:0.014005406086749486 - cluster/prob_snapshot/cluster_61:0.014005406086749486 - cluster/prob_snapshot/cluster_62:0.014005406086749486 - cluster/prob_snapshot/cluster_63:0.014404560160221843
(RewardLoopWorker pid=2826754) WARNING:2026-04-12 21:31:45,902:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) local_global_step_folder: /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/checkpoints/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633/global_step_300
(WorkerDict pid=2825160) /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.
(WorkerDict pid=2825160)   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")
(TaskRunner pid=2823680) test_gen_batch meta info: {'eos_token_id': 151643, 'pad_token_id': 151643, 'recompute_log_prob': False, 'do_sample': True, 'validate': True, 'global_steps': 300}
(TaskRunner pid=2823680) validation generation end
(TaskRunner pid=2823680) Training Progress:  38%|███▊      | 300/800 [10:04:51<25:46:22, 185.56s/it]
(WorkerDict pid=2825159) /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead. [repeated 3x across cluster]
(WorkerDict pid=2825159)   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}") [repeated 3x across cluster]
[36m(TaskRunner pid=2823680)[0m step:300 - global_seqlen/min:406111 - global_seqlen/max:565497 - global_seqlen/minmax_diff:159386 - global_seqlen/balanced_min:460653 - global_seqlen/balanced_max:460762 - global_seqlen/mean:460727.25 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.1483314616263521) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009182619862258434 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.038771294017351465) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0007065027388381744) - actor/ppo_kl:np.float64(0.00014094056996443715) - actor/pg_clipfrac_lower:np.float64(6.671007770132653e-06) - actor/grad_norm:np.float64(0.3629543139384343) - perf/mfu/actor:np.float64(0.21622663626668268) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(113.39189147949219) - actor/lr:np.float64(1e-06) - val-aux/aime2024/reward/mean@16:np.float64(0.10625) - val-aux/aime2024/reward/std@16:np.float64(0.11995231623840655) - val-aux/aime2024/reward/best@2/mean:np.float64(0.15439999999999998) - val-aux/aime2024/reward/best@2/std:np.float64(0.12038961583920292) - val-aux/aime2024/reward/worst@2/mean:np.float64(0.058966666666666674) - val-aux/aime2024/reward/worst@2/std:np.float64(0.08147396479918227) - val-aux/aime2024/reward/maj@2/mean:np.float64(0.10593333333333335) - val-aux/aime2024/reward/maj@2/std:np.float64(0.12090195249576438) - val-aux/aime2024/reward/best@4/mean:np.float64(0.1993666666666667) - val-aux/aime2024/reward/best@4/std:np.float64(0.10711996573421931) - val-aux/aime2024/reward/worst@4/mean:np.float64(0.028333333333333332) - val-aux/aime2024/reward/worst@4/std:np.float64(0.049456223470028475) - val-aux/aime2024/reward/maj@4/mean:np.float64(0.12813333333333335) - val-aux/aime2024/reward/maj@4/std:np.float64(0.10141365662610323) - 
val-aux/aime2024/reward/best@8/mean:np.float64(0.24096666666666663) - val-aux/aime2024/reward/best@8/std:np.float64(0.0914727605876013) - val-aux/aime2024/reward/worst@8/mean:np.float64(0.009633333333333334) - val-aux/aime2024/reward/worst@8/std:np.float64(0.024503733638015408) - val-aux/aime2024/reward/maj@8/mean:np.float64(0.14773333333333333) - val-aux/aime2024/reward/maj@8/std:np.float64(0.07385193784122496) - val-aux/aime2024/reward/best@16/mean:np.float64(0.28150000000000003) - val-aux/aime2024/reward/best@16/std:np.float64(0.07567199401599482) - val-aux/aime2024/reward/worst@16/mean:np.float64(0.0011333333333333334) - val-aux/aime2024/reward/worst@16/std:np.float64(0.008181865643308667) - val-aux/aime2024/reward/maj@16/mean:np.float64(0.16226666666666664) - val-aux/aime2024/reward/maj@16/std:np.float64(0.0404475297877076) - val-aux/aime2024/score/mean@16:np.float64(0.10625) - val-aux/aime2024/score/std@16:np.float64(0.11995231623840655) - val-aux/aime2024/score/best@2/mean:np.float64(0.15439999999999998) - val-aux/aime2024/score/best@2/std:np.float64(0.12038961583920292) - val-aux/aime2024/score/worst@2/mean:np.float64(0.058966666666666674) - val-aux/aime2024/score/worst@2/std:np.float64(0.08147396479918227) - val-aux/aime2024/score/maj@2/mean:np.float64(0.10593333333333335) - val-aux/aime2024/score/maj@2/std:np.float64(0.12090195249576438) - val-aux/aime2024/score/best@4/mean:np.float64(0.1993666666666667) - val-aux/aime2024/score/best@4/std:np.float64(0.10711996573421931) - val-aux/aime2024/score/worst@4/mean:np.float64(0.028333333333333332) - val-aux/aime2024/score/worst@4/std:np.float64(0.049456223470028475) - val-aux/aime2024/score/maj@4/mean:np.float64(0.12813333333333335) - val-aux/aime2024/score/maj@4/std:np.float64(0.10141365662610323) - val-aux/aime2024/score/best@8/mean:np.float64(0.24096666666666663) - val-aux/aime2024/score/best@8/std:np.float64(0.0914727605876013) - val-aux/aime2024/score/worst@8/mean:np.float64(0.009633333333333334) - 
val-aux/aime2024/score/worst@8/std:np.float64(0.024503733638015408) - val-aux/aime2024/score/maj@8/mean:np.float64(0.14773333333333333) - val-aux/aime2024/score/maj@8/std:np.float64(0.07385193784122496) - val-aux/aime2024/score/best@16/mean:np.float64(0.28150000000000003) - val-aux/aime2024/score/best@16/std:np.float64(0.07567199401599482) - val-aux/aime2024/score/worst@16/mean:np.float64(0.0011333333333333334) - val-aux/aime2024/score/worst@16/std:np.float64(0.008181865643308667) - val-aux/aime2024/score/maj@16/mean:np.float64(0.16226666666666664) - val-aux/aime2024/score/maj@16/std:np.float64(0.0404475297877076) - val-core/aime2024/acc/mean@16:np.float64(0.10625) - val-aux/aime2024/acc/std@16:np.float64(0.11995231623840655) - val-aux/aime2024/acc/best@2/mean:np.float64(0.15439999999999998) - val-aux/aime2024/acc/best@2/std:np.float64(0.12038961583920292) - val-aux/aime2024/acc/worst@2/mean:np.float64(0.058966666666666674) - val-aux/aime2024/acc/worst@2/std:np.float64(0.08147396479918227) - val-aux/aime2024/acc/maj@2/mean:np.float64(0.10593333333333335) - val-aux/aime2024/acc/maj@2/std:np.float64(0.12090195249576438) - val-aux/aime2024/acc/best@4/mean:np.float64(0.1993666666666667) - val-aux/aime2024/acc/best@4/std:np.float64(0.10711996573421931) - val-aux/aime2024/acc/worst@4/mean:np.float64(0.028333333333333332) - val-aux/aime2024/acc/worst@4/std:np.float64(0.049456223470028475) - val-aux/aime2024/acc/maj@4/mean:np.float64(0.12813333333333335) - val-aux/aime2024/acc/maj@4/std:np.float64(0.10141365662610323) - val-aux/aime2024/acc/best@8/mean:np.float64(0.24096666666666663) - val-aux/aime2024/acc/best@8/std:np.float64(0.0914727605876013) - val-aux/aime2024/acc/worst@8/mean:np.float64(0.009633333333333334) - val-aux/aime2024/acc/worst@8/std:np.float64(0.024503733638015408) - val-aux/aime2024/acc/maj@8/mean:np.float64(0.14773333333333333) - val-aux/aime2024/acc/maj@8/std:np.float64(0.07385193784122496) - 
val-core/aime2024/acc/best@16/mean:np.float64(0.28150000000000003) - val-core/aime2024/acc/best@16/std:np.float64(0.07567199401599482) - val-aux/aime2024/acc/worst@16/mean:np.float64(0.0011333333333333334) - val-aux/aime2024/acc/worst@16/std:np.float64(0.008181865643308667) - val-core/aime2024/acc/maj@16/mean:np.float64(0.16226666666666664) - val-core/aime2024/acc/maj@16/std:np.float64(0.0404475297877076) - val-aux/aime2025/reward/mean@16:np.float64(0.06875) - val-aux/aime2025/reward/std@16:np.float64(0.09814512269421188) - val-aux/aime2025/reward/best@2/mean:np.float64(0.11200000000000002) - val-aux/aime2025/reward/best@2/std:np.float64(0.10394133179673758) - val-aux/aime2025/reward/worst@2/mean:np.float64(0.02733333333333333) - val-aux/aime2025/reward/worst@2/std:np.float64(0.06282640561252009) - val-aux/aime2025/reward/maj@2/mean:np.float64(0.0701) - val-aux/aime2025/reward/maj@2/std:np.float64(0.09852758828190535) - val-aux/aime2025/reward/best@4/mean:np.float64(0.15753333333333333) - val-aux/aime2025/reward/best@4/std:np.float64(0.09299676222427188) - val-aux/aime2025/reward/worst@4/mean:np.float64(0.0061) - val-aux/aime2025/reward/worst@4/std:np.float64(0.02552661844893481) - val-aux/aime2025/reward/maj@4/mean:np.float64(0.08203333333333333) - val-aux/aime2025/reward/maj@4/std:np.float64(0.0964395598360303) - val-aux/aime2025/reward/best@8/mean:np.float64(0.19949999999999998) - val-aux/aime2025/reward/best@8/std:np.float64(0.06554885842841997) - val-aux/aime2025/reward/worst@8/mean:np.float64(0.0004) - val-aux/aime2025/reward/worst@8/std:np.float64(0.005130211242890276) - val-aux/aime2025/reward/maj@8/mean:np.float64(0.09369999999999999) - val-aux/aime2025/reward/maj@8/std:np.float64(0.09240404408512409) - val-aux/aime2025/reward/best@16/mean:np.float64(0.2236) - val-aux/aime2025/reward/best@16/std:np.float64(0.03322973520437668) - val-aux/aime2025/reward/worst@16/mean:np.float64(3.3333333333333335e-05) - 
val-aux/aime2025/reward/worst@16/std:np.float64(0.001053565375285274) - val-aux/aime2025/reward/maj@16/mean:np.float64(0.10996666666666667) - val-aux/aime2025/reward/maj@16/std:np.float64(0.08019460163338367) - val-aux/aime2025/score/mean@16:np.float64(0.06875) - val-aux/aime2025/score/std@16:np.float64(0.09814512269421188) - val-aux/aime2025/score/best@2/mean:np.float64(0.11200000000000002) - val-aux/aime2025/score/best@2/std:np.float64(0.10394133179673758) - val-aux/aime2025/score/worst@2/mean:np.float64(0.02733333333333333) - val-aux/aime2025/score/worst@2/std:np.float64(0.06282640561252009) - val-aux/aime2025/score/maj@2/mean:np.float64(0.0701) - val-aux/aime2025/score/maj@2/std:np.float64(0.09852758828190535) - val-aux/aime2025/score/best@4/mean:np.float64(0.15753333333333333) - val-aux/aime2025/score/best@4/std:np.float64(0.09299676222427188) - val-aux/aime2025/score/worst@4/mean:np.float64(0.0061) - val-aux/aime2025/score/worst@4/std:np.float64(0.02552661844893481) - val-aux/aime2025/score/maj@4/mean:np.float64(0.08203333333333333) - val-aux/aime2025/score/maj@4/std:np.float64(0.0964395598360303) - val-aux/aime2025/score/best@8/mean:np.float64(0.19949999999999998) - val-aux/aime2025/score/best@8/std:np.float64(0.06554885842841997) - val-aux/aime2025/score/worst@8/mean:np.float64(0.0004) - val-aux/aime2025/score/worst@8/std:np.float64(0.005130211242890276) - val-aux/aime2025/score/maj@8/mean:np.float64(0.09369999999999999) - val-aux/aime2025/score/maj@8/std:np.float64(0.09240404408512409) - val-aux/aime2025/score/best@16/mean:np.float64(0.2236) - val-aux/aime2025/score/best@16/std:np.float64(0.03322973520437668) - val-aux/aime2025/score/worst@16/mean:np.float64(3.3333333333333335e-05) - val-aux/aime2025/score/worst@16/std:np.float64(0.001053565375285274) - val-aux/aime2025/score/maj@16/mean:np.float64(0.10996666666666667) - val-aux/aime2025/score/maj@16/std:np.float64(0.08019460163338367) - val-core/aime2025/acc/mean@16:np.float64(0.06875) - 
val-aux/aime2025/acc/std@16:np.float64(0.09814512269421188) - val-aux/aime2025/acc/best@2/mean:np.float64(0.11200000000000002) - val-aux/aime2025/acc/best@2/std:np.float64(0.10394133179673758) - val-aux/aime2025/acc/worst@2/mean:np.float64(0.02733333333333333) - val-aux/aime2025/acc/worst@2/std:np.float64(0.06282640561252009) - val-aux/aime2025/acc/maj@2/mean:np.float64(0.0701) - val-aux/aime2025/acc/maj@2/std:np.float64(0.09852758828190535) - val-aux/aime2025/acc/best@4/mean:np.float64(0.15753333333333333) - val-aux/aime2025/acc/best@4/std:np.float64(0.09299676222427188) - val-aux/aime2025/acc/worst@4/mean:np.float64(0.0061) - val-aux/aime2025/acc/worst@4/std:np.float64(0.02552661844893481) - val-aux/aime2025/acc/maj@4/mean:np.float64(0.08203333333333333) - val-aux/aime2025/acc/maj@4/std:np.float64(0.0964395598360303) - val-aux/aime2025/acc/best@8/mean:np.float64(0.19949999999999998) - val-aux/aime2025/acc/best@8/std:np.float64(0.06554885842841997) - val-aux/aime2025/acc/worst@8/mean:np.float64(0.0004) - val-aux/aime2025/acc/worst@8/std:np.float64(0.005130211242890276) - val-aux/aime2025/acc/maj@8/mean:np.float64(0.09369999999999999) - val-aux/aime2025/acc/maj@8/std:np.float64(0.09240404408512409) - val-core/aime2025/acc/best@16/mean:np.float64(0.2236) - val-core/aime2025/acc/best@16/std:np.float64(0.03322973520437668) - val-aux/aime2025/acc/worst@16/mean:np.float64(3.3333333333333335e-05) - val-aux/aime2025/acc/worst@16/std:np.float64(0.001053565375285274) - val-core/aime2025/acc/maj@16/mean:np.float64(0.10996666666666667) - val-core/aime2025/acc/maj@16/std:np.float64(0.08019460163338367) - val-aux/math500/reward/mean@4:np.float64(0.721) - val-aux/math500/reward/std@4:np.float64(0.11640638795573721) - val-aux/math500/reward/best@2/mean:np.float64(0.773032) - val-aux/math500/reward/best@2/std:np.float64(0.09596304830570421) - val-aux/math500/reward/worst@2/mean:np.float64(0.668676) - val-aux/math500/reward/worst@2/std:np.float64(0.10389300033190978) - 
val-aux/math500/reward/maj@2/mean:np.float64(0.72072) - val-aux/math500/reward/maj@2/std:np.float64(0.11624054247674342) - val-aux/math500/reward/best@4/mean:np.float64(0.811558) - val-aux/math500/reward/best@4/std:np.float64(0.06006813497870701) - val-aux/math500/reward/worst@4/mean:np.float64(0.624584) - val-aux/math500/reward/worst@4/std:np.float64(0.07327313836660543) - val-aux/math500/reward/maj@4/mean:np.float64(0.7319580000000001) - val-aux/math500/reward/maj@4/std:np.float64(0.10589028816598083) - val-aux/math500/score/mean@4:np.float64(0.721) - val-aux/math500/score/std@4:np.float64(0.11640638795573721) - val-aux/math500/score/best@2/mean:np.float64(0.773032) - val-aux/math500/score/best@2/std:np.float64(0.09596304830570421) - val-aux/math500/score/worst@2/mean:np.float64(0.668676) - val-aux/math500/score/worst@2/std:np.float64(0.10389300033190978) - val-aux/math500/score/maj@2/mean:np.float64(0.72072) - val-aux/math500/score/maj@2/std:np.float64(0.11624054247674342) - val-aux/math500/score/best@4/mean:np.float64(0.811558) - val-aux/math500/score/best@4/std:np.float64(0.06006813497870701) - val-aux/math500/score/worst@4/mean:np.float64(0.624584) - val-aux/math500/score/worst@4/std:np.float64(0.07327313836660543) - val-aux/math500/score/maj@4/mean:np.float64(0.7319580000000001) - val-aux/math500/score/maj@4/std:np.float64(0.10589028816598083) - val-core/math500/acc/mean@4:np.float64(0.721) - val-aux/math500/acc/std@4:np.float64(0.11640638795573721) - val-aux/math500/acc/best@2/mean:np.float64(0.773032) - val-aux/math500/acc/best@2/std:np.float64(0.09596304830570421) - val-aux/math500/acc/worst@2/mean:np.float64(0.668676) - val-aux/math500/acc/worst@2/std:np.float64(0.10389300033190978) - val-aux/math500/acc/maj@2/mean:np.float64(0.72072) - val-aux/math500/acc/maj@2/std:np.float64(0.11624054247674342) - val-core/math500/acc/best@4/mean:np.float64(0.811558) - val-core/math500/acc/best@4/std:np.float64(0.06006813497870701) - 
val-aux/math500/acc/worst@4/mean:np.float64(0.624584) - val-aux/math500/acc/worst@4/std:np.float64(0.07327313836660543) - val-core/math500/acc/maj@4/mean:np.float64(0.7319580000000001) - val-core/math500/acc/maj@4/std:np.float64(0.10589028816598083) - val-aux/num_turns/min:np.int32(2) - val-aux/num_turns/max:np.int32(2) - val-aux/num_turns/mean:np.float64(2.0) - val-aux/response_length/clip_ratio:0.07702702702702703 - val-aux/aime2024/response_length/clip_ratio:0.18541666666666667 - val-aux/aime2025/response_length/clip_ratio:0.1125 - val-aux/math500/response_length/clip_ratio:0.0425 - training/global_step:300 - training/epoch:0 - critic/score/mean:0.5880101919174194 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6061849594116211 - critic/rewards/max:1.1967217922210693 - critic/rewards/min:-0.07380782812833786 - critic/advantages/mean:-0.009003475308418274 - critic/advantages/max:2.474719524383545 - critic/advantages/min:-2.474832057952881 - critic/returns/mean:-0.009003475308418274 - critic/returns/max:2.474719524383545 - critic/returns/min:-2.474832057952881 - response_length/mean:1443.130126953125 - response_length/max:8192.0 - response_length/min:159.0 - response_length/clip_ratio:0.04719387739896774 - response_length_non_aborted/mean:1443.130126953125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:159.0 - response_length_non_aborted/clip_ratio:0.04719387739896774 - response/aborted_ratio:0.0 - prompt_length/mean:235.84693908691406 - prompt_length/max:383.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.950196206569672e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3834397345781326) - 
timing_s/agent_loop/generate_sequences/max:np.float64(34.20810004789382) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.232800270445296) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(34.20810004789382) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:206 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:36.86877106316388 - timing_s/reward:0.000117545947432518 - timing_s/old_log_prob:11.856324209831655 - timing_s/ref:26.416005723178387 - timing_s/adv:0.08410855010151863 - timing_s/update_actor:25.55207334831357 - timing_s/save_checkpoint:57.93580631073564 - timing_s/update_weights:33.2767827603966 - timing_s/step:192.38806666620076 - timing_s/testing:137.10548495687544 - timing_s/stop_profile:0.0004100501537322998 - timing_per_token_ms/adv:6.389683199767734e-05 - timing_per_token_ms/update_actor:0.01941177842156194 - timing_per_token_ms/gen:0.032586454704611995 - timing_per_token_ms/ref:0.020068103393844336 - perf/total_num_tokens:1842909 - perf/time_per_step:192.38806666620076 - perf/throughput:2394.780809349137 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:160.0 - frontier/mean_score:2.2724656249999997 - frontier/mean_frontier_pct:0.06143881030881311 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.11 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.3 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - 
frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:3.11 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.51 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.51 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.3 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.4659 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.51 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.6569999999999996 - 
frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.6569999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.9 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:3.11 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:1.7 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.3 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.0569999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:2.9299999999999997 - 
frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.53 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.3 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.9299999999999997 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:2.7598999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.6569999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.3 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:2.51 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:2.6569999999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - 
frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.3 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:32.0 - frontier/cluster_63/score:2.0569999999999995 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:300.0 - cluster/prob_snapshot/cluster_0:0.021383711799820956 - cluster/prob_snapshot/cluster_1:0.0158143206236618 - cluster/prob_snapshot/cluster_2:0.011688845678358721 - cluster/prob_snapshot/cluster_3:0.021383711799820956 - cluster/prob_snapshot/cluster_4:0.017258236854517876 - cluster/prob_snapshot/cluster_5:0.017258236854517876 - cluster/prob_snapshot/cluster_6:0.0158143206236618 - cluster/prob_snapshot/cluster_7:0.013751583151010261 - cluster/prob_snapshot/cluster_8:0.016955014446038103 - cluster/prob_snapshot/cluster_9:0.013751583151010261 - cluster/prob_snapshot/cluster_10:0.013751583151010261 - cluster/prob_snapshot/cluster_11:0.0158143206236618 - cluster/prob_snapshot/cluster_12:0.013751583151010261 - cluster/prob_snapshot/cluster_13:0.013751583151010261 - cluster/prob_snapshot/cluster_14:0.013751583151010261 - cluster/prob_snapshot/cluster_15:0.0158143206236618 - cluster/prob_snapshot/cluster_16:0.013751583151010261 - cluster/prob_snapshot/cluster_17:0.017258236854517876 - cluster/prob_snapshot/cluster_18:0.017258236854517876 - cluster/prob_snapshot/cluster_19:0.0158143206236618 - 
cluster/prob_snapshot/cluster_20:0.01826897821611713 - cluster/prob_snapshot/cluster_21:0.013751583151010261 - cluster/prob_snapshot/cluster_22:0.011688845678358721 - cluster/prob_snapshot/cluster_23:0.01826897821611713 - cluster/prob_snapshot/cluster_24:0.01993979556896488 - cluster/prob_snapshot/cluster_25:0.010244929447502644 - cluster/prob_snapshot/cluster_26:0.013132761909214799 - cluster/prob_snapshot/cluster_27:0.021383711799820956 - cluster/prob_snapshot/cluster_28:0.013751583151010261 - cluster/prob_snapshot/cluster_29:0.011688845678358721 - cluster/prob_snapshot/cluster_30:0.011688845678358721 - cluster/prob_snapshot/cluster_31:0.013751583151010261 - cluster/prob_snapshot/cluster_32:0.0158143206236618 - cluster/prob_snapshot/cluster_33:0.013751583151010261 - cluster/prob_snapshot/cluster_34:0.0158143206236618 - cluster/prob_snapshot/cluster_35:0.01414350327081405 - cluster/prob_snapshot/cluster_36:0.0158143206236618 - cluster/prob_snapshot/cluster_37:0.013751583151010261 - cluster/prob_snapshot/cluster_38:0.02014606931623003 - cluster/prob_snapshot/cluster_39:0.0158143206236618 - cluster/prob_snapshot/cluster_40:0.02427154426153311 - cluster/prob_snapshot/cluster_41:0.0158143206236618 - cluster/prob_snapshot/cluster_42:0.0158143206236618 - cluster/prob_snapshot/cluster_43:0.02014606931623003 - cluster/prob_snapshot/cluster_44:0.01897649716923661 - cluster/prob_snapshot/cluster_45:0.013751583151010261 - cluster/prob_snapshot/cluster_46:0.01826897821611713 - cluster/prob_snapshot/cluster_47:0.017258236854517876 - cluster/prob_snapshot/cluster_48:0.0158143206236618 - cluster/prob_snapshot/cluster_49:0.013751583151010261 - cluster/prob_snapshot/cluster_50:0.010244929447502644 - cluster/prob_snapshot/cluster_51:0.013751583151010261 - cluster/prob_snapshot/cluster_52:0.0158143206236618 - cluster/prob_snapshot/cluster_53:0.017258236854517876 - cluster/prob_snapshot/cluster_54:0.013751583151010261 - cluster/prob_snapshot/cluster_55:0.01826897821611713 - 
cluster/prob_snapshot/cluster_56:0.0158143206236618 - cluster/prob_snapshot/cluster_57:0.0158143206236618 - cluster/prob_snapshot/cluster_58:0.0158143206236618 - cluster/prob_snapshot/cluster_59:0.0158143206236618 - cluster/prob_snapshot/cluster_60:0.011688845678358721 - cluster/prob_snapshot/cluster_61:0.013751583151010261 - cluster/prob_snapshot/cluster_62:0.0158143206236618 - cluster/prob_snapshot/cluster_63:0.01414350327081405
(TaskRunner pid=2823680) Training Progress:  38%|███▊      | 301/800 [10:06:57<23:13:14, 167.52s/it]
[36m(TaskRunner pid=2823680)[0m step:301 - global_seqlen/min:386373 - global_seqlen/max:512032 - global_seqlen/minmax_diff:125659 - global_seqlen/balanced_min:428862 - global_seqlen/balanced_max:428965 - global_seqlen/mean:428928.5 - frontier/skipped_zero_acc_count:36.0 - actor/entropy:np.float64(0.19249852477451382) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011961961165070534 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.023948880232637748) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0006179325814034952) - actor/ppo_kl:np.float64(-1.4831788793463088e-06) - actor/pg_clipfrac_lower:np.float64(8.439698099802785e-06) - actor/grad_norm:np.float64(0.3724328726530075) - perf/mfu/actor:np.float64(0.22941165595442403) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(113.76998138427734) - actor/lr:np.float64(1e-06) - training/global_step:301 - training/epoch:0 - critic/score/mean:0.5991848111152649 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6100953817367554 - critic/rewards/max:1.3383160829544067 - critic/rewards/min:-0.05941569060087204 - critic/advantages/mean:-0.005656649824231863 - critic/advantages/max:2.474541664123535 - critic/advantages/min:-2.4748330116271973 - critic/returns/mean:-0.005656649824231863 - critic/returns/max:2.474541664123535 - critic/returns/min:-2.4748330116271973 - response_length/mean:1337.6385498046875 - response_length/max:8192.0 - response_length/min:159.0 - response_length/clip_ratio:0.04211956635117531 - response_length_non_aborted/mean:1337.6385498046875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:159.0 - response_length_non_aborted/clip_ratio:0.04211956635117531 - response/aborted_ratio:0.0 - prompt_length/mean:245.6086883544922 - prompt_length/max:378.0 - prompt_length/min:178.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.690225124359131e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.33286446146667) - timing_s/agent_loop/generate_sequences/max:np.float64(33.29263671115041) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.559308385581062) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(33.29263671115041) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:190 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:35.07147904206067 - timing_s/reward:0.0001774616539478302 - timing_s/old_log_prob:11.482065783813596 - timing_s/ref:24.99538830295205 - timing_s/adv:0.07219252549111843 - timing_s/update_actor:22.560723530128598 - timing_s/update_weights:30.055881910957396 - timing_s/step:125.02314269542694 - timing_s/stop_profile:7.30324536561966e-05 - timing_per_token_ms/adv:6.195347472355629e-05 - timing_per_token_ms/update_actor:0.019360940837856118 - timing_per_token_ms/gen:0.035623573179191785 - timing_per_token_ms/ref:0.02145029761596201 - perf/total_num_tokens:1715714 - perf/time_per_step:125.02314269542694 - perf/throughput:3430.792817653985 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:196.0 - frontier/mean_score:2.2925703124999997 - frontier/mean_frontier_pct:0.08381502747647024 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.11 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.51 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:3.11 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.6569999999999996 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.51 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.3 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.9 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.4659 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.51 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.6569999999999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.7598999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.9299999999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:3.11 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:16.0 - frontier/cluster_30/score:2.09 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.3 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.3 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.0569999999999995 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.3 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:16.0 - frontier/cluster_38/score:2.9299999999999997 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:32.0 - frontier/cluster_40/score:3.53 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:1.91 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.9509999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:32.0 - frontier/cluster_44/score:2.7598999999999996 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.6569999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.49 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.3 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.3 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - 
frontier/cluster_53/score:2.51 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.7598999999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.51 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:1.7 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.7398999999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:301.0 - cluster/prob_snapshot/cluster_0:0.021196187412463496 - cluster/prob_snapshot/cluster_1:0.017106890805557357 - cluster/prob_snapshot/cluster_2:0.011586340386234066 - cluster/prob_snapshot/cluster_3:0.021196187412463496 - cluster/prob_snapshot/cluster_4:0.01810876847424936 - cluster/prob_snapshot/cluster_5:0.017106890805557357 - cluster/prob_snapshot/cluster_6:0.015675636993140205 - cluster/prob_snapshot/cluster_7:0.019764933600046348 - cluster/prob_snapshot/cluster_8:0.016806327504949755 - cluster/prob_snapshot/cluster_9:0.013630988689687137 - cluster/prob_snapshot/cluster_10:0.013630988689687137 - cluster/prob_snapshot/cluster_11:0.015675636993140205 - cluster/prob_snapshot/cluster_12:0.013630988689687137 - cluster/prob_snapshot/cluster_13:0.013630988689687137 - 
cluster/prob_snapshot/cluster_14:0.013630988689687137 - cluster/prob_snapshot/cluster_15:0.015675636993140205 - cluster/prob_snapshot/cluster_16:0.013630988689687137 - cluster/prob_snapshot/cluster_17:0.017106890805557357 - cluster/prob_snapshot/cluster_18:0.017106890805557357 - cluster/prob_snapshot/cluster_19:0.015675636993140205 - cluster/prob_snapshot/cluster_20:0.01810876847424936 - cluster/prob_snapshot/cluster_21:0.013630988689687137 - cluster/prob_snapshot/cluster_22:0.010155086573816917 - cluster/prob_snapshot/cluster_23:0.01881008284233376 - cluster/prob_snapshot/cluster_24:0.019969398430391652 - cluster/prob_snapshot/cluster_25:0.010155086573816917 - cluster/prob_snapshot/cluster_26:0.013017594198651216 - cluster/prob_snapshot/cluster_27:0.021196187412463496 - cluster/prob_snapshot/cluster_28:0.013630988689687137 - cluster/prob_snapshot/cluster_29:0.010155086573816917 - cluster/prob_snapshot/cluster_30:0.014244383180723057 - cluster/prob_snapshot/cluster_31:0.015675636993140205 - cluster/prob_snapshot/cluster_32:0.015675636993140205 - cluster/prob_snapshot/cluster_33:0.013630988689687137 - cluster/prob_snapshot/cluster_34:0.015675636993140205 - cluster/prob_snapshot/cluster_35:0.014019471867343217 - cluster/prob_snapshot/cluster_36:0.015675636993140205 - cluster/prob_snapshot/cluster_37:0.013630988689687137 - cluster/prob_snapshot/cluster_38:0.019969398430391652 - cluster/prob_snapshot/cluster_39:0.015675636993140205 - cluster/prob_snapshot/cluster_40:0.024058695037297795 - cluster/prob_snapshot/cluster_41:0.015675636993140205 - cluster/prob_snapshot/cluster_42:0.013017594198651216 - cluster/prob_snapshot/cluster_43:0.02011252381163337 - cluster/prob_snapshot/cluster_44:0.01881008284233376 - cluster/prob_snapshot/cluster_45:0.013630988689687137 - cluster/prob_snapshot/cluster_46:0.01810876847424936 - cluster/prob_snapshot/cluster_47:0.017106890805557357 - cluster/prob_snapshot/cluster_48:0.015675636993140205 - 
cluster/prob_snapshot/cluster_49:0.013630988689687137 - cluster/prob_snapshot/cluster_50:0.010155086573816917 - cluster/prob_snapshot/cluster_51:0.015675636993140205 - cluster/prob_snapshot/cluster_52:0.015675636993140205 - cluster/prob_snapshot/cluster_53:0.017106890805557357 - cluster/prob_snapshot/cluster_54:0.013630988689687137 - cluster/prob_snapshot/cluster_55:0.01881008284233376 - cluster/prob_snapshot/cluster_56:0.015675636993140205 - cluster/prob_snapshot/cluster_57:0.015675636993140205 - cluster/prob_snapshot/cluster_58:0.015675636993140205 - cluster/prob_snapshot/cluster_59:0.017106890805557357 - cluster/prob_snapshot/cluster_60:0.011586340386234066 - cluster/prob_snapshot/cluster_61:0.011586340386234066 - cluster/prob_snapshot/cluster_62:0.015675636993140205 - cluster/prob_snapshot/cluster_63:0.011858278610593322
(RewardLoopWorker pid=2826756) WARNING:2026-04-12 21:39:24,913:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  38%|███▊      | 302/800 [10:08:54<21:06:18, 152.57s/it]
[36m(TaskRunner pid=2823680)[0m step:302 - global_seqlen/min:337975 - global_seqlen/max:471041 - global_seqlen/minmax_diff:133066 - global_seqlen/balanced_min:411602 - global_seqlen/balanced_max:411705 - global_seqlen/mean:411651.75 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.1710247393261562) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010213350877165794 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.07634719858469907) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0008263270452865053) - actor/ppo_kl:np.float64(0.00024433736217225547) - actor/pg_clipfrac_lower:np.float64(1.1198563117161145e-05) - actor/grad_norm:np.float64(0.4608815014362335) - perf/mfu/actor:np.float64(0.23045459112328862) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(113.67522048950195) - actor/lr:np.float64(1e-06) - training/global_step:302 - training/epoch:0 - critic/score/mean:0.7207446694374084 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.7314388751983643 - critic/rewards/max:1.209036111831665 - critic/rewards/min:-0.08285505324602127 - critic/advantages/mean:-0.06906786561012268 - critic/advantages/max:2.47483229637146 - critic/advantages/min:-2.4748101234436035 - critic/returns/mean:-0.06906786561012268 - critic/returns/max:2.47483229637146 - critic/returns/min:-2.4748101234436035 - response_length/mean:1197.021240234375 - response_length/max:8192.0 - response_length/min:149.0 - response_length/clip_ratio:0.03058510646224022 - response_length_non_aborted/mean:1197.021240234375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:149.0 - response_length_non_aborted/clip_ratio:0.03058510646224022 - response/aborted_ratio:0.0 - prompt_length/mean:241.87234497070312 - prompt_length/max:535.0 - prompt_length/min:170.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.056273847818375e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1957632238045335) - timing_s/agent_loop/generate_sequences/max:np.float64(33.23424221202731) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.133890087889085) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(33.23424221202731) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:206 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:35.53171709086746 - timing_s/reward:0.00013164803385734558 - timing_s/old_log_prob:10.21793067548424 - timing_s/ref:21.38042983971536 - timing_s/adv:0.08377422019839287 - timing_s/update_actor:21.330905354581773 - timing_s/update_weights:27.61054262984544 - timing_s/step:116.52297888882458 - timing_s/stop_profile:5.684327334165573e-05 - timing_per_token_ms/adv:7.742190752941909e-05 - timing_per_token_ms/update_actor:0.019713455738175914 - timing_per_token_ms/gen:0.03947266829326726 - timing_per_token_ms/ref:0.01975922495094059 - perf/total_num_tokens:1646607 - perf/time_per_step:116.52297888882458 - perf/throughput:3532.7945948992597 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:230.0 - frontier/mean_score:2.3723712499999996 - frontier/mean_frontier_pct:0.09778241102269308 - frontier/batch_easy_count:4.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.11 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.51 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:16.0 - frontier/cluster_3/score:3.11 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.6569999999999996 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.6569999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:2.51 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.9 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.62613 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.3 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.9 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.3 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.3 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.51 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.6569999999999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.7598999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.9299999999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:3.11 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.49 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.3629999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.51 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:0.0 - frontier/cluster_32/score:2.3 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.0569999999999995 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:2.3509999999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:48.0 - frontier/cluster_40/score:3.9709999999999996 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:3.11 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:16.0 - frontier/cluster_42/score:1.91 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.9509999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:3.4319299999999995 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:16.0 - frontier/cluster_46/score:2.6569999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:32.0 - frontier/cluster_50/score:1.9429999999999998 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.3 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.3 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:2.51 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.7598999999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.51 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:1.7 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.3 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.7398999999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:302.0 - cluster/prob_snapshot/cluster_0:0.020483197981766135 - cluster/prob_snapshot/cluster_1:0.0165314556058627 - cluster/prob_snapshot/cluster_2:0.011196603398393064 - cluster/prob_snapshot/cluster_3:0.020483197981766135 - cluster/prob_snapshot/cluster_4:0.017499632487959042 - cluster/prob_snapshot/cluster_5:0.017499632487959042 - cluster/prob_snapshot/cluster_6:0.0165314556058627 - cluster/prob_snapshot/cluster_7:0.019100088150199934 - cluster/prob_snapshot/cluster_8:0.01729631534271881 - cluster/prob_snapshot/cluster_9:0.015148345774296499 - cluster/prob_snapshot/cluster_10:0.019100088150199934 - cluster/prob_snapshot/cluster_11:0.015148345774296499 - cluster/prob_snapshot/cluster_12:0.013172474586344783 - 
cluster/prob_snapshot/cluster_13:0.013172474586344783 - cluster/prob_snapshot/cluster_14:0.015148345774296499 - cluster/prob_snapshot/cluster_15:0.015148345774296499 - cluster/prob_snapshot/cluster_16:0.015148345774296499 - cluster/prob_snapshot/cluster_17:0.0165314556058627 - cluster/prob_snapshot/cluster_18:0.0165314556058627 - cluster/prob_snapshot/cluster_19:0.015148345774296499 - cluster/prob_snapshot/cluster_20:0.017499632487959042 - cluster/prob_snapshot/cluster_21:0.013172474586344783 - cluster/prob_snapshot/cluster_22:0.009813493566826863 - cluster/prob_snapshot/cluster_23:0.01817735630542648 - cluster/prob_snapshot/cluster_24:0.019297675268995104 - cluster/prob_snapshot/cluster_25:0.009813493566826863 - cluster/prob_snapshot/cluster_26:0.012579713229959266 - cluster/prob_snapshot/cluster_27:0.020483197981766135 - cluster/prob_snapshot/cluster_28:0.015148345774296499 - cluster/prob_snapshot/cluster_29:0.009813493566826863 - cluster/prob_snapshot/cluster_30:0.015563278723766357 - cluster/prob_snapshot/cluster_31:0.0165314556058627 - cluster/prob_snapshot/cluster_32:0.015148345774296499 - cluster/prob_snapshot/cluster_33:0.013172474586344783 - cluster/prob_snapshot/cluster_34:0.015148345774296499 - cluster/prob_snapshot/cluster_35:0.013547890112055606 - cluster/prob_snapshot/cluster_36:0.0165314556058627 - cluster/prob_snapshot/cluster_37:0.013172474586344783 - cluster/prob_snapshot/cluster_38:0.01548424387624829 - cluster/prob_snapshot/cluster_39:0.015148345774296499 - cluster/prob_snapshot/cluster_40:0.026153948291187563 - cluster/prob_snapshot/cluster_41:0.020483197981766135 - cluster/prob_snapshot/cluster_42:0.012579713229959266 - cluster/prob_snapshot/cluster_43:0.019435986252151725 - cluster/prob_snapshot/cluster_44:0.02260350535355712 - cluster/prob_snapshot/cluster_45:0.013172474586344783 - cluster/prob_snapshot/cluster_46:0.017499632487959042 - cluster/prob_snapshot/cluster_47:0.0165314556058627 - 
cluster/prob_snapshot/cluster_48:0.015148345774296499 - cluster/prob_snapshot/cluster_49:0.013172474586344783 - cluster/prob_snapshot/cluster_50:0.012797059060633956 - cluster/prob_snapshot/cluster_51:0.015148345774296499 - cluster/prob_snapshot/cluster_52:0.015148345774296499 - cluster/prob_snapshot/cluster_53:0.0165314556058627 - cluster/prob_snapshot/cluster_54:0.013172474586344783 - cluster/prob_snapshot/cluster_55:0.01817735630542648 - cluster/prob_snapshot/cluster_56:0.015148345774296499 - cluster/prob_snapshot/cluster_57:0.015148345774296499 - cluster/prob_snapshot/cluster_58:0.015148345774296499 - cluster/prob_snapshot/cluster_59:0.0165314556058627 - cluster/prob_snapshot/cluster_60:0.011196603398393064 - cluster/prob_snapshot/cluster_61:0.011196603398393064 - cluster/prob_snapshot/cluster_62:0.015148345774296499 - cluster/prob_snapshot/cluster_63:0.011459394266390642
(TaskRunner pid=2823680) Training Progress:  38%|███▊      | 303/800 [10:11:05<20:09:29, 146.01s/it]
[36m(TaskRunner pid=2823680)[0m step:303 - global_seqlen/min:378633 - global_seqlen/max:570134 - global_seqlen/minmax_diff:191501 - global_seqlen/balanced_min:474157 - global_seqlen/balanced_max:474384 - global_seqlen/mean:474294.25 - frontier/skipped_zero_acc_count:28.0 - actor/entropy:np.float64(0.17367236556485297) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010984113439917564 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.039563665675814264) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0008291712818754604) - actor/ppo_kl:np.float64(-1.3967454624435049e-05) - actor/pg_clipfrac_lower:np.float64(2.1933036878181154e-05) - actor/grad_norm:np.float64(0.4404416004052529) - perf/mfu/actor:np.float64(0.23861901141616995) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(113.95011520385742) - actor/lr:np.float64(1e-06) - training/global_step:303 - training/epoch:0 - critic/score/mean:0.6012499928474426 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6213809251785278 - critic/rewards/max:1.1923199892044067 - critic/rewards/min:-0.07246766239404678 - critic/advantages/mean:-0.05896954610943794 - critic/advantages/max:2.474778890609741 - critic/advantages/min:-2.474825382232666 - critic/returns/mean:-0.05896954610943794 - critic/returns/max:2.474778890609741 - critic/returns/min:-2.474825382232666 - response_length/mean:1433.44873046875 - response_length/max:8192.0 - response_length/min:170.0 - response_length/clip_ratio:0.05874999985098839 - response_length_non_aborted/mean:1433.44873046875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:170.0 - response_length_non_aborted/clip_ratio:0.05874999985098839 - response/aborted_ratio:0.0 - prompt_length/mean:247.11000061035156 - prompt_length/max:816.0 - prompt_length/min:187.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.169802069664001e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5361388912424445) - timing_s/agent_loop/generate_sequences/max:np.float64(35.732112617231905) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.77446913215499) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(35.732112617231905) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:189 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:38.18144590873271 - timing_s/reward:0.00013235025107860565 - timing_s/old_log_prob:12.29404357355088 - timing_s/ref:24.859084967523813 - timing_s/adv:0.07586100324988365 - timing_s/update_actor:24.038849470205605 - timing_s/update_weights:30.633762939833105 - timing_s/step:130.47063758037984 - timing_s/stop_profile:7.489137351512909e-05 - timing_per_token_ms/adv:5.642543235239742e-05 - timing_per_token_ms/update_actor:0.017880101982603706 - timing_per_token_ms/gen:0.03329509156564955 - timing_per_token_ms/ref:0.018490193341592354 - perf/total_num_tokens:1897177 - perf/time_per_step:130.47063758037984 - perf/throughput:3635.256627820176 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:258.0 - frontier/mean_score:2.42271078125 - frontier/mean_frontier_pct:0.11410152503335752 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.11 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.51 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:3.6769999999999996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.6569999999999996 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.6569999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:16.0 - frontier/cluster_6/score:2.51 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.9299999999999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:32.0 - frontier/cluster_8/score:2.62613 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.3 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.9 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.3 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.3 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.51 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.51 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.6569999999999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.49 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.7598999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.9299999999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:3.11 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.9429999999999998 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.3629999999999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.51 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:2.51 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.3 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.0569999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:2.3509999999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:4.2797 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:16.0 - frontier/cluster_41/score:3.0769999999999995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.637 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.9509999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:3.4319299999999995 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:2.2600999999999996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.3 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - 
frontier/cluster_52/score:2.51 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:2.6569999999999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.8319299999999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.51 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.51 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.09 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.51 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.7398999999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:303.0 - cluster/prob_snapshot/cluster_0:0.020057594317935055 - cluster/prob_snapshot/cluster_1:0.0161879619736389 - cluster/prob_snapshot/cluster_2:0.010963958308839098 - cluster/prob_snapshot/cluster_3:0.023714396883294915 - cluster/prob_snapshot/cluster_4:0.017136021897991457 - cluster/prob_snapshot/cluster_5:0.017136021897991457 - cluster/prob_snapshot/cluster_6:0.0161879619736389 - cluster/prob_snapshot/cluster_7:0.018896704614646206 - cluster/prob_snapshot/cluster_8:0.01693692931387742 - cluster/prob_snapshot/cluster_9:0.014833590653135248 - cluster/prob_snapshot/cluster_10:0.0187032229974314 - 
cluster/prob_snapshot/cluster_11:0.014833590653135248 - cluster/prob_snapshot/cluster_12:0.012898774480987174 - cluster/prob_snapshot/cluster_13:0.012898774480987174 - cluster/prob_snapshot/cluster_14:0.014833590653135248 - cluster/prob_snapshot/cluster_15:0.014833590653135248 - cluster/prob_snapshot/cluster_16:0.014833590653135248 - cluster/prob_snapshot/cluster_17:0.0161879619736389 - cluster/prob_snapshot/cluster_18:0.0161879619736389 - cluster/prob_snapshot/cluster_19:0.014833590653135248 - cluster/prob_snapshot/cluster_20:0.017136021897991457 - cluster/prob_snapshot/cluster_21:0.012898774480987174 - cluster/prob_snapshot/cluster_22:0.009609586988335445 - cluster/prob_snapshot/cluster_23:0.017799663845038248 - cluster/prob_snapshot/cluster_24:0.018896704614646206 - cluster/prob_snapshot/cluster_25:0.009609586988335445 - cluster/prob_snapshot/cluster_26:0.01231832962934275 - cluster/prob_snapshot/cluster_27:0.020057594317935055 - cluster/prob_snapshot/cluster_28:0.014833590653135248 - cluster/prob_snapshot/cluster_29:0.012531159408279037 - cluster/prob_snapshot/cluster_30:0.015239902049286342 - cluster/prob_snapshot/cluster_31:0.0161879619736389 - cluster/prob_snapshot/cluster_32:0.0161879619736389 - cluster/prob_snapshot/cluster_33:0.014833590653135248 - cluster/prob_snapshot/cluster_34:0.014833590653135248 - cluster/prob_snapshot/cluster_35:0.013266389553695305 - cluster/prob_snapshot/cluster_36:0.0161879619736389 - cluster/prob_snapshot/cluster_37:0.012898774480987174 - cluster/prob_snapshot/cluster_38:0.01516250940240042 - cluster/prob_snapshot/cluster_39:0.014833590653135248 - cluster/prob_snapshot/cluster_40:0.027601442573140404 - cluster/prob_snapshot/cluster_41:0.019844764538998763 - cluster/prob_snapshot/cluster_42:0.010557646912688002 - cluster/prob_snapshot/cluster_43:0.01903214174669657 - cluster/prob_snapshot/cluster_44:0.02213384555226715 - cluster/prob_snapshot/cluster_45:0.012898774480987174 - cluster/prob_snapshot/cluster_46:0.017799663845038248 
- cluster/prob_snapshot/cluster_47:0.0161879619736389 - cluster/prob_snapshot/cluster_48:0.014833590653135248 - cluster/prob_snapshot/cluster_49:0.012898774480987174 - cluster/prob_snapshot/cluster_50:0.014576260102239552 - cluster/prob_snapshot/cluster_51:0.014833590653135248 - cluster/prob_snapshot/cluster_52:0.0161879619736389 - cluster/prob_snapshot/cluster_53:0.017136021897991457 - cluster/prob_snapshot/cluster_54:0.012898774480987174 - cluster/prob_snapshot/cluster_55:0.018264213207971 - cluster/prob_snapshot/cluster_56:0.0161879619736389 - cluster/prob_snapshot/cluster_57:0.014833590653135248 - cluster/prob_snapshot/cluster_58:0.014833590653135248 - cluster/prob_snapshot/cluster_59:0.0161879619736389 - cluster/prob_snapshot/cluster_60:0.010963958308839098 - cluster/prob_snapshot/cluster_61:0.013479219332631595 - cluster/prob_snapshot/cluster_62:0.0161879619736389 - cluster/prob_snapshot/cluster_63:0.011221288859734788
[36m(TaskRunner pid=2823680)[0m Training Progress:  38%|███▊      | 304/800 [10:13:17<19:31:47, 141.75s/it]
[36m(TaskRunner pid=2823680)[0m step:304 - global_seqlen/min:379847 - global_seqlen/max:579481 - global_seqlen/minmax_diff:199634 - global_seqlen/balanced_min:448412 - global_seqlen/balanced_max:448633 - global_seqlen/mean:448560.25 - frontier/skipped_zero_acc_count:28.0 - actor/entropy:np.float64(0.17384929962456228) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010697852820158005 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.010880088433623314) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0007718196612404427) - actor/ppo_kl:np.float64(0.0001836306957306988) - actor/pg_clipfrac_lower:np.float64(1.1646115617622853e-05) - actor/grad_norm:np.float64(0.37803165041483366) - perf/mfu/actor:np.float64(0.21992222314896548) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(113.50696182250977) - actor/lr:np.float64(1e-06) - training/global_step:304 - training/epoch:0 - critic/score/mean:0.5899999737739563 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6003598570823669 - critic/rewards/max:1.5292482376098633 - critic/rewards/min:-0.302047461271286 - critic/advantages/mean:-0.08773532509803772 - critic/advantages/max:2.4746651649475098 - critic/advantages/min:-2.4748198986053467 - critic/returns/mean:-0.08773532509803772 - critic/returns/max:2.4746651649475098 - critic/returns/min:-2.4748198986053467 - response_length/mean:1468.62744140625 - response_length/max:8192.0 - response_length/min:217.0 - response_length/clip_ratio:0.042500000447034836 - response_length_non_aborted/mean:1468.62744140625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:217.0 - response_length_non_aborted/clip_ratio:0.042500000447034836 - response/aborted_ratio:0.0 - prompt_length/mean:231.97999572753906 - prompt_length/max:379.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00016014929860830307 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6515719378367066) - timing_s/agent_loop/generate_sequences/max:np.float64(34.202631524764) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.124556886296887) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(34.202631524764) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:176 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:36.03390528075397 - timing_s/reward:0.00015859678387641907 - timing_s/old_log_prob:12.668210914358497 - timing_s/ref:26.75967876985669 - timing_s/adv:0.09370293654501438 - timing_s/update_actor:24.303024454042315 - timing_s/update_weights:31.244561619125307 - timing_s/step:131.54211408179253 - timing_s/stop_profile:5.136057734489441e-05 - timing_per_token_ms/adv:6.887460550495513e-05 - timing_per_token_ms/update_actor:0.017863487352344908 - timing_per_token_ms/gen:0.03066971141487032 - timing_per_token_ms/ref:0.019669205541149773 - perf/total_num_tokens:1794241 - perf/time_per_step:131.54211408179253 - perf/throughput:3410.0124749484144 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:286.0 - frontier/mean_score:2.464447671875 - frontier/mean_frontier_pct:0.12444465434887109 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:15.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.11 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.51 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:16.0 - frontier/cluster_2/score:1.7 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:3.6769999999999996 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:16.0 - frontier/cluster_4/score:2.6569999999999996 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:2.6569999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:2.0569999999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:16.0 - frontier/cluster_7/score:2.9299999999999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.7382909999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.3 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.9 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - 
frontier/cluster_17/score:2.6569999999999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.51 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.6569999999999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.3 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.9429999999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:2.7598999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.9299999999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:3.11 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:2.51 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.9429999999999998 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.5540999999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.51 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:2.51 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.3 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.0569999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:32.0 - frontier/cluster_38/score:2.3509999999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:4.2797 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:3.0538999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.637 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.9509999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:3.4319299999999995 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.3 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:2.2600999999999996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:2.51 - frontier/cluster_51/cluster_size:216.0 
- frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:2.51 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:2.6569999999999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.3 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:32.0 - frontier/cluster_55/score:2.8319299999999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.6569999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.51 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.51 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:16.0 - frontier/cluster_61/score:2.09 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.6569999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.7398999999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:304.0 - cluster/prob_snapshot/cluster_0:0.019717906999838598 - cluster/prob_snapshot/cluster_1:0.015913809186364914 - cluster/prob_snapshot/cluster_2:0.01077827713817544 - cluster/prob_snapshot/cluster_3:0.02331277943357123 - cluster/prob_snapshot/cluster_4:0.016845813150665963 - cluster/prob_snapshot/cluster_5:0.016845813150665963 - cluster/prob_snapshot/cluster_6:0.01304171533719228 - cluster/prob_snapshot/cluster_7:0.018576677655796493 - cluster/prob_snapshot/cluster_8:0.017361211342924444 - cluster/prob_snapshot/cluster_9:0.014582374951649124 - 
cluster/prob_snapshot/cluster_10:0.01838647276512281 - cluster/prob_snapshot/cluster_11:0.014582374951649124 - cluster/prob_snapshot/cluster_12:0.012680326044912283 - cluster/prob_snapshot/cluster_13:0.012680326044912283 - cluster/prob_snapshot/cluster_14:0.015913809186364914 - cluster/prob_snapshot/cluster_15:0.014582374951649124 - cluster/prob_snapshot/cluster_16:0.015913809186364914 - cluster/prob_snapshot/cluster_17:0.016845813150665963 - cluster/prob_snapshot/cluster_18:0.015913809186364914 - cluster/prob_snapshot/cluster_19:0.014582374951649124 - cluster/prob_snapshot/cluster_20:0.016845813150665963 - cluster/prob_snapshot/cluster_21:0.014582374951649124 - cluster/prob_snapshot/cluster_22:0.012318936752632282 - cluster/prob_snapshot/cluster_23:0.017498215925676703 - cluster/prob_snapshot/cluster_24:0.018576677655796493 - cluster/prob_snapshot/cluster_25:0.00944684290345965 - cluster/prob_snapshot/cluster_26:0.012109711372891229 - cluster/prob_snapshot/cluster_27:0.019717906999838598 - cluster/prob_snapshot/cluster_28:0.015913809186364914 - cluster/prob_snapshot/cluster_29:0.012318936752632282 - cluster/prob_snapshot/cluster_30:0.016193410375655228 - cluster/prob_snapshot/cluster_31:0.015913809186364914 - cluster/prob_snapshot/cluster_32:0.015913809186364914 - cluster/prob_snapshot/cluster_33:0.014582374951649124 - cluster/prob_snapshot/cluster_34:0.014582374951649124 - cluster/prob_snapshot/cluster_35:0.01304171533719228 - cluster/prob_snapshot/cluster_36:0.015913809186364914 - cluster/prob_snapshot/cluster_37:0.012680326044912283 - cluster/prob_snapshot/cluster_38:0.014905723265794386 - cluster/prob_snapshot/cluster_39:0.014582374951649124 - cluster/prob_snapshot/cluster_40:0.027133995687205547 - cluster/prob_snapshot/cluster_41:0.01936222385427881 - cluster/prob_snapshot/cluster_42:0.010378846867760702 - cluster/prob_snapshot/cluster_43:0.01870982107926807 - cluster/prob_snapshot/cluster_44:0.0217589956816579 - 
cluster/prob_snapshot/cluster_45:0.012680326044912283 - cluster/prob_snapshot/cluster_46:0.017498215925676703 - cluster/prob_snapshot/cluster_47:0.015913809186364914 - cluster/prob_snapshot/cluster_48:0.014582374951649124 - cluster/prob_snapshot/cluster_49:0.014582374951649124 - cluster/prob_snapshot/cluster_50:0.014329402447053122 - cluster/prob_snapshot/cluster_51:0.015913809186364914 - cluster/prob_snapshot/cluster_52:0.015913809186364914 - cluster/prob_snapshot/cluster_53:0.016845813150665963 - cluster/prob_snapshot/cluster_54:0.014582374951649124 - cluster/prob_snapshot/cluster_55:0.017954897868184216 - cluster/prob_snapshot/cluster_56:0.016845813150665963 - cluster/prob_snapshot/cluster_57:0.015913809186364914 - cluster/prob_snapshot/cluster_58:0.014582374951649124 - cluster/prob_snapshot/cluster_59:0.015913809186364914 - cluster/prob_snapshot/cluster_60:0.01077827713817544 - cluster/prob_snapshot/cluster_61:0.013250940716933335 - cluster/prob_snapshot/cluster_62:0.016845813150665963 - cluster/prob_snapshot/cluster_63:0.011031249642771437
[36m(TaskRunner pid=2823680)[0m Training Progress:  38%|███▊      | 305/800 [10:15:34<19:17:14, 140.27s/it]
[36m(TaskRunner pid=2823680)[0m step:305 - global_seqlen/min:376297 - global_seqlen/max:477108 - global_seqlen/minmax_diff:100811 - global_seqlen/balanced_min:446879 - global_seqlen/balanced_max:447019 - global_seqlen/mean:446948.5 - frontier/skipped_zero_acc_count:39.0 - actor/entropy:np.float64(0.19449744381838374) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011273113079369068 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.061447715963367955) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0008450432722586104) - actor/ppo_kl:np.float64(1.89668359970104e-05) - actor/pg_clipfrac_lower:np.float64(7.155235592411676e-06) - actor/grad_norm:np.float64(0.3695927324394385) - perf/mfu/actor:np.float64(0.23117254993662803) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(119.1698112487793) - actor/lr:np.float64(1e-06) - training/global_step:305 - training/epoch:0 - critic/score/mean:0.6544944047927856 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6751900911331177 - critic/rewards/max:1.4938791990280151 - critic/rewards/min:-0.07778291404247284 - critic/advantages/mean:-0.11525123566389084 - critic/advantages/max:2.4733221530914307 - critic/advantages/min:-2.4744839668273926 - critic/returns/mean:-0.11525123566389084 - critic/returns/max:2.4733221530914307 - critic/returns/min:-2.4744839668273926 - response_length/mean:1409.46484375 - response_length/max:8192.0 - response_length/min:210.0 - response_length/clip_ratio:0.04775280877947807 - response_length_non_aborted/mean:1409.46484375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:210.0 - response_length_non_aborted/clip_ratio:0.04775280877947807 - response/aborted_ratio:0.0 - prompt_length/mean:243.19100952148438 - prompt_length/max:357.0 - prompt_length/min:181.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.110329508781433e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6703297551721334) - timing_s/agent_loop/generate_sequences/max:np.float64(35.12335539981723) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.988662292601475) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(35.12335539981723) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:192 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:37.04330033343285 - timing_s/reward:0.0001406511291861534 - timing_s/old_log_prob:11.346628107130527 - timing_s/ref:25.2260712813586 - timing_s/adv:0.08438885118812323 - timing_s/update_actor:23.133644084446132 - timing_s/update_weights:39.36306961160153 - timing_s/step:136.58292122092098 - timing_s/stop_profile:7.375236600637436e-05 - timing_per_token_ms/adv:7.171708731359654e-05 - timing_per_token_ms/update_actor:0.019659914186856305 - timing_per_token_ms/gen:0.036912666407018416 - timing_per_token_ms/ref:0.021438144152847775 - perf/total_num_tokens:1787794 - perf/time_per_step:136.58292122092098 - perf/throughput:3272.360087225452 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:325.0 - frontier/mean_score:2.4852511249999996 - frontier/mean_frontier_pct:0.14237552325021618 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.51 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:3.4738999999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.7598999999999996 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:32.0 - frontier/cluster_5/score:2.7598999999999996 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:2.339899999999999 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:3.5509999999999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.7382909999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.3 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.9 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.51 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:16.0 - 
frontier/cluster_17/score:2.6569999999999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.51 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.6569999999999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.3 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.9429999999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.2319299999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:2.9299999999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:3.11 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:16.0 - frontier/cluster_28/score:2.6569999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:32.0 - frontier/cluster_29/score:1.9429999999999998 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.5540999999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.6569999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:2.51 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.3 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - 
frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.0569999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:1.9456999999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:64.0 - frontier/cluster_40/score:4.2797 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:3.0538999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.637 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:32.0 - frontier/cluster_43/score:2.9656999999999996 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:3.4319299999999995 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.3 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:2.4820699999999993 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - 
frontier/cluster_51/score:2.6569999999999996 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:2.51 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:2.6569999999999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.3 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:3.4823509999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.6569999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.51 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.51 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:2.3629999999999995 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:16.0 - frontier/cluster_62/score:2.6569999999999996 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.7398999999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:305.0 - cluster/prob_snapshot/cluster_0:0.01934537903086152 - cluster/prob_snapshot/cluster_1:0.01578059842946455 - cluster/prob_snapshot/cluster_2:0.009367765601554653 - cluster/prob_snapshot/cluster_3:0.0218407254518394 - cluster/prob_snapshot/cluster_4:0.01735174247230247 - cluster/prob_snapshot/cluster_5:0.01735174247230247 - cluster/prob_snapshot/cluster_6:0.014711164249045453 - cluster/prob_snapshot/cluster_7:0.022325460168537297 - 
cluster/prob_snapshot/cluster_8:0.0172158847227159 - cluster/prob_snapshot/cluster_9:0.014460309317836042 - cluster/prob_snapshot/cluster_10:0.018232563922488922 - cluster/prob_snapshot/cluster_11:0.014460309317836042 - cluster/prob_snapshot/cluster_12:0.012574182015509602 - cluster/prob_snapshot/cluster_13:0.012574182015509602 - cluster/prob_snapshot/cluster_14:0.01578059842946455 - cluster/prob_snapshot/cluster_15:0.014460309317836042 - cluster/prob_snapshot/cluster_16:0.01578059842946455 - cluster/prob_snapshot/cluster_17:0.016704800807604503 - cluster/prob_snapshot/cluster_18:0.01578059842946455 - cluster/prob_snapshot/cluster_19:0.014460309317836042 - cluster/prob_snapshot/cluster_20:0.016704800807604503 - cluster/prob_snapshot/cluster_21:0.014460309317836042 - cluster/prob_snapshot/cluster_22:0.012215817828067577 - cluster/prob_snapshot/cluster_23:0.014032347032938168 - cluster/prob_snapshot/cluster_24:0.018421176652721565 - cluster/prob_snapshot/cluster_25:0.009367765601554653 - cluster/prob_snapshot/cluster_26:0.01200834382481167 - cluster/prob_snapshot/cluster_27:0.01955285303411743 - cluster/prob_snapshot/cluster_28:0.016704800807604503 - cluster/prob_snapshot/cluster_29:0.012215817828067577 - cluster/prob_snapshot/cluster_30:0.016057859142906535 - cluster/prob_snapshot/cluster_31:0.016704800807604503 - cluster/prob_snapshot/cluster_32:0.01578059842946455 - cluster/prob_snapshot/cluster_33:0.014460309317836042 - cluster/prob_snapshot/cluster_34:0.014460309317836042 - cluster/prob_snapshot/cluster_35:0.012932546202951622 - cluster/prob_snapshot/cluster_36:0.01578059842946455 - cluster/prob_snapshot/cluster_37:0.012574182015509602 - cluster/prob_snapshot/cluster_38:0.012232792973788513 - cluster/prob_snapshot/cluster_39:0.014460309317836042 - cluster/prob_snapshot/cluster_40:0.02690686338588822 - cluster/prob_snapshot/cluster_41:0.019200147228582383 - cluster/prob_snapshot/cluster_42:0.01029196797969461 - cluster/prob_snapshot/cluster_43:0.01864562580169841 
- cluster/prob_snapshot/cluster_44:0.02157685624224393 - cluster/prob_snapshot/cluster_45:0.012574182015509602 - cluster/prob_snapshot/cluster_46:0.01735174247230247 - cluster/prob_snapshot/cluster_47:0.01578059842946455 - cluster/prob_snapshot/cluster_48:0.014460309317836042 - cluster/prob_snapshot/cluster_49:0.014460309317836042 - cluster/prob_snapshot/cluster_50:0.015604999977617955 - cluster/prob_snapshot/cluster_51:0.016704800807604503 - cluster/prob_snapshot/cluster_52:0.01578059842946455 - cluster/prob_snapshot/cluster_53:0.016704800807604503 - cluster/prob_snapshot/cluster_54:0.014460309317836042 - cluster/prob_snapshot/cluster_55:0.021893857657945937 - cluster/prob_snapshot/cluster_56:0.016704800807604503 - cluster/prob_snapshot/cluster_57:0.01578059842946455 - cluster/prob_snapshot/cluster_58:0.014460309317836042 - cluster/prob_snapshot/cluster_59:0.01578059842946455 - cluster/prob_snapshot/cluster_60:0.01068805471318316 - cluster/prob_snapshot/cluster_61:0.014856396051324591 - cluster/prob_snapshot/cluster_62:0.016704800807604503 - cluster/prob_snapshot/cluster_63:0.010938909644392576
[36m(TaskRunner pid=2823680)[0m Training Progress:  38%|███▊      | 306/800 [10:17:34<18:25:51, 134.31s/it]
[36m(TaskRunner pid=2823680)[0m step:306 - global_seqlen/min:341326 - global_seqlen/max:485720 - global_seqlen/minmax_diff:144394 - global_seqlen/balanced_min:406226 - global_seqlen/balanced_max:406337 - global_seqlen/mean:406287.0 - frontier/skipped_zero_acc_count:28.0 - actor/entropy:np.float64(0.17282167646102609) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.012591041624546051 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.007302233643713407) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0012923589213460218) - actor/ppo_kl:np.float64(0.0006183688973783319) - actor/pg_clipfrac_lower:np.float64(9.318224947492126e-05) - actor/grad_norm:np.float64(0.41304437472270084) - perf/mfu/actor:np.float64(0.20405489623207623) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(104.87264251708984) - actor/lr:np.float64(1e-06) - training/global_step:306 - training/epoch:0 - critic/score/mean:0.6474999785423279 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6611784100532532 - critic/rewards/max:1.837256669998169 - critic/rewards/min:-0.10715904831886292 - critic/advantages/mean:-0.09029535949230194 - critic/advantages/max:2.4747135639190674 - critic/advantages/min:-2.4748458862304688 - critic/returns/mean:-0.09029535949230194 - critic/returns/max:2.4747135639190674 - critic/returns/min:-2.4748458862304688 - response_length/mean:1265.7462158203125 - response_length/max:8192.0 - response_length/min:146.0 - response_length/clip_ratio:0.03750000149011612 - response_length_non_aborted/mean:1265.7462158203125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:146.0 - response_length_non_aborted/clip_ratio:0.03750000149011612 - response/aborted_ratio:0.0 - prompt_length/mean:234.77000427246094 - prompt_length/max:414.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.13361257314682e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2099433997645974) - timing_s/agent_loop/generate_sequences/max:np.float64(32.208925342187285) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.993364988849862) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(32.208925342187285) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:225 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:33.77201319113374 - timing_s/reward:0.00013498682528734207 - timing_s/old_log_prob:11.774723675101995 - timing_s/ref:21.301311398856342 - timing_s/adv:0.0951442252844572 - timing_s/update_actor:23.816794734448195 - timing_s/update_weights:29.018899864517152 - timing_s/step:120.20727969799191 - timing_s/stop_profile:4.949700087308884e-05 - timing_per_token_ms/adv:7.925957589967553e-05 - timing_per_token_ms/update_actor:0.019840500506449192 - timing_per_token_ms/gen:0.03335187956426272 - timing_per_token_ms/ref:0.01774498559983634 - perf/total_num_tokens:1625148 - perf/time_per_step:120.20727969799191 - perf/throughput:3379.8868173437845 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:353.0 - frontier/mean_score:2.5162425714285708 - frontier/mean_frontier_pct:0.1579170916146645 - frontier/batch_easy_count:4.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.51 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:32.0 - frontier/cluster_2/score:1.49 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:3.4738999999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.7598999999999996 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:2.2319299999999993 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:32.0 - frontier/cluster_6/score:2.339899999999999 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:3.5509999999999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.7382909999999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:2.51 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.9 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.6569999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.0569999999999995 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.6569999999999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.51 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.6569999999999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.3 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.9429999999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.2319299999999993 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:3.5509999999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.49 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:3.0769999999999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.7598999999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:32.0 - frontier/cluster_30/score:2.5540999999999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.6569999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:2.51 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.3 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:32.0 - frontier/cluster_35/score:2.0569999999999995 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:1.7 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:1.9456999999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:4.4957899999999995 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:3.0538999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.637 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.9759899999999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:3.4319299999999995 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:3.11 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:2.4820699999999993 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:2.6569999999999996 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.0569999999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:2.6569999999999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.3 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:3.4823509999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.6569999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.51 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.51 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.6569999999999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:2.3629999999999995 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:3.3598999999999997 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.7398999999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:306.0 - cluster/prob_snapshot/cluster_0:0.019410398025950536 - cluster/prob_snapshot/cluster_1:0.015833636348760432 - cluster/prob_snapshot/cluster_2:0.009399250262809978 - cluster/prob_snapshot/cluster_3:0.02191413119998361 - cluster/prob_snapshot/cluster_4:0.01741006093981829 - cluster/prob_snapshot/cluster_5:0.014079509153740583 - cluster/prob_snapshot/cluster_6:0.014760607845603397 - 
cluster/prob_snapshot/cluster_7:0.022400495089421633 - cluster/prob_snapshot/cluster_8:0.017273746578120936 - cluster/prob_snapshot/cluster_9:0.015833636348760432 - cluster/prob_snapshot/cluster_10:0.018293842793388547 - cluster/prob_snapshot/cluster_11:0.014508909801652986 - cluster/prob_snapshot/cluster_12:0.012616443305785206 - cluster/prob_snapshot/cluster_13:0.012616443305785206 - cluster/prob_snapshot/cluster_14:0.016760944931735643 - cluster/prob_snapshot/cluster_15:0.014508909801652986 - cluster/prob_snapshot/cluster_16:0.01297601194000008 - cluster/prob_snapshot/cluster_17:0.016760944931735643 - cluster/prob_snapshot/cluster_18:0.015833636348760432 - cluster/prob_snapshot/cluster_19:0.014508909801652986 - cluster/prob_snapshot/cluster_20:0.016760944931735643 - cluster/prob_snapshot/cluster_21:0.014508909801652986 - cluster/prob_snapshot/cluster_22:0.012256874671570326 - cluster/prob_snapshot/cluster_23:0.014079509153740583 - cluster/prob_snapshot/cluster_24:0.022400495089421633 - cluster/prob_snapshot/cluster_25:0.009399250262809978 - cluster/prob_snapshot/cluster_26:0.01204870335702487 - cluster/prob_snapshot/cluster_27:0.019410398025950536 - cluster/prob_snapshot/cluster_28:0.01741006093981829 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.016111828923652996 - cluster/prob_snapshot/cluster_31:0.016760944931735643 - cluster/prob_snapshot/cluster_32:0.015833636348760432 - cluster/prob_snapshot/cluster_33:0.014508909801652986 - cluster/prob_snapshot/cluster_34:0.014508909801652986 - cluster/prob_snapshot/cluster_35:0.01297601194000008 - cluster/prob_snapshot/cluster_36:0.015833636348760432 - cluster/prob_snapshot/cluster_37:0.010723976809917424 - cluster/prob_snapshot/cluster_38:0.012273906870033135 - cluster/prob_snapshot/cluster_39:0.014508909801652986 - cluster/prob_snapshot/cluster_40:0.028360439824858034 - cluster/prob_snapshot/cluster_41:0.019264678105768717 - cluster/prob_snapshot/cluster_42:0.010326558845785191 - 
cluster/prob_snapshot/cluster_43:0.018773204556791854 - cluster/prob_snapshot/cluster_44:0.02164937513721171 - cluster/prob_snapshot/cluster_45:0.012616443305785206 - cluster/prob_snapshot/cluster_46:0.01741006093981829 - cluster/prob_snapshot/cluster_47:0.015833636348760432 - cluster/prob_snapshot/cluster_48:0.014508909801652986 - cluster/prob_snapshot/cluster_49:0.019618569340495995 - cluster/prob_snapshot/cluster_50:0.01565744771799514 - cluster/prob_snapshot/cluster_51:0.016760944931735643 - cluster/prob_snapshot/cluster_52:0.01297601194000008 - cluster/prob_snapshot/cluster_53:0.016760944931735643 - cluster/prob_snapshot/cluster_54:0.014508909801652986 - cluster/prob_snapshot/cluster_55:0.021967441981172207 - cluster/prob_snapshot/cluster_56:0.016760944931735643 - cluster/prob_snapshot/cluster_57:0.015833636348760432 - cluster/prob_snapshot/cluster_58:0.015833636348760432 - cluster/prob_snapshot/cluster_59:0.016760944931735643 - cluster/prob_snapshot/cluster_60:0.010723976809917424 - cluster/prob_snapshot/cluster_61:0.014906327765785219 - cluster/prob_snapshot/cluster_62:0.021194993931553854 - cluster/prob_snapshot/cluster_63:0.010975674853867837
[36m(TaskRunner pid=2823680)[0m Training Progress:  38%|███▊      | 307/800 [10:19:42<18:08:01, 132.42s/it]
[36m(TaskRunner pid=2823680)[0m step:307 - global_seqlen/min:379520 - global_seqlen/max:519710 - global_seqlen/minmax_diff:140190 - global_seqlen/balanced_min:464433 - global_seqlen/balanced_max:464591 - global_seqlen/mean:464495.0 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.15651493147015572) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009396381676197052 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.003679942856251728) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.000706407158077127) - actor/ppo_kl:np.float64(5.215034935505495e-05) - actor/pg_clipfrac_lower:np.float64(1.0758303214212598e-05) - actor/grad_norm:np.float64(0.3847691851357619) - perf/mfu/actor:np.float64(0.24889206271838984) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(104.28376197814941) - actor/lr:np.float64(1e-06) - training/global_step:307 - training/epoch:0 - critic/score/mean:0.609375 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6242156028747559 - critic/rewards/max:1.2211346626281738 - critic/rewards/min:-0.05990448221564293 - critic/advantages/mean:-0.06634658575057983 - critic/advantages/max:2.474799871444702 - critic/advantages/min:-2.47483491897583 - critic/returns/mean:-0.06634658575057983 - critic/returns/max:2.474799871444702 - critic/returns/min:-2.47483491897583 - response_length/mean:1414.23046875 - response_length/max:8192.0 - response_length/min:220.0 - response_length/clip_ratio:0.0403645820915699 - response_length_non_aborted/mean:1414.23046875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:220.0 - response_length_non_aborted/clip_ratio:0.0403645820915699 - response/aborted_ratio:0.0 - prompt_length/mean:241.6979217529297 - prompt_length/max:393.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.231308311223984e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6573260510340333) - timing_s/agent_loop/generate_sequences/max:np.float64(34.851700206287205) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.240671595402091) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(34.851700206287205) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:203 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:36.971445037052035 - timing_s/reward:0.00012282002717256546 - timing_s/old_log_prob:11.777153524570167 - timing_s/ref:24.708654620684683 - timing_s/adv:0.08170278370380402 - timing_s/update_actor:22.38715716637671 - timing_s/update_weights:31.470043604262173 - timing_s/step:127.7770048128441 - timing_s/stop_profile:5.141552537679672e-05 - timing_per_token_ms/adv:6.42442232916329e-05 - timing_per_token_ms/update_actor:0.017603384593059115 - timing_per_token_ms/gen:0.03403964449623575 - timing_per_token_ms/ref:0.019428815674651198 - perf/total_num_tokens:1857980 - perf/time_per_step:127.7770048128441 - perf/throughput:3635.200251252948 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:385.0 - frontier/mean_score:2.559971376190476 - frontier/mean_frontier_pct:0.1652762066737011 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:14.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.51 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:32.0 - frontier/cluster_3/score:3.4738999999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.8319299999999994 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:2.2319299999999993 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:48.0 - frontier/cluster_6/score:2.5379299999999994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:3.5509999999999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.8168036999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:2.6569999999999996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.9 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.6569999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.51 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.0569999999999995 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:16.0 - frontier/cluster_17/score:2.6569999999999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.6569999999999996 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.6569999999999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.3 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.9429999999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.462350999999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:3.5509999999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.9429999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:3.0538999999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.8319299999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.6878699999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.6569999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:2.51 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.3 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:2.9398999999999997 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:1.7 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:1.9456999999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:4.4957899999999995 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:3.0538999999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.637 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.9759899999999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:3.3023509999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:3.11 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - frontier/cluster_50/score:2.4820699999999993 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:16.0 - frontier/cluster_51/score:2.6569999999999996 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.339899999999999 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:2.6569999999999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.3 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:3.4823509999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.6569999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.51 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.6569999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.6569999999999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:2.3629999999999995 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:3.3598999999999997 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:48.0 - frontier/cluster_63/score:1.7398999999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:307.0 - cluster/prob_snapshot/cluster_0:0.01907883435554311 - cluster/prob_snapshot/cluster_1:0.015563170046283136 - cluster/prob_snapshot/cluster_2:0.008327226044684563 - cluster/prob_snapshot/cluster_3:0.021539799372025093 - cluster/prob_snapshot/cluster_4:0.01755928611520741 - cluster/prob_snapshot/cluster_5:0.01383900642286881 - 
cluster/prob_snapshot/cluster_6:0.015736349065961497 - cluster/prob_snapshot/cluster_7:0.022017855312490606 - cluster/prob_snapshot/cluster_8:0.01746549600402371 - cluster/prob_snapshot/cluster_9:0.016474638570906093 - cluster/prob_snapshot/cluster_10:0.01798135184630323 - cluster/prob_snapshot/cluster_11:0.014261072153964627 - cluster/prob_snapshot/cluster_12:0.012400932307795329 - cluster/prob_snapshot/cluster_13:0.012400932307795329 - cluster/prob_snapshot/cluster_14:0.016474638570906093 - cluster/prob_snapshot/cluster_15:0.015563170046283136 - cluster/prob_snapshot/cluster_16:0.012754358878567493 - cluster/prob_snapshot/cluster_17:0.016474638570906093 - cluster/prob_snapshot/cluster_18:0.016474638570906093 - cluster/prob_snapshot/cluster_19:0.014261072153964627 - cluster/prob_snapshot/cluster_20:0.016474638570906093 - cluster/prob_snapshot/cluster_21:0.014261072153964627 - cluster/prob_snapshot/cluster_22:0.012047505737023161 - cluster/prob_snapshot/cluster_23:0.015267724034516063 - cluster/prob_snapshot/cluster_24:0.022017855312490606 - cluster/prob_snapshot/cluster_25:0.012047505737023161 - cluster/prob_snapshot/cluster_26:0.011842890353944539 - cluster/prob_snapshot/cluster_27:0.018935603587388074 - cluster/prob_snapshot/cluster_28:0.01755928611520741 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.016666046961076913 - cluster/prob_snapshot/cluster_31:0.016474638570906093 - cluster/prob_snapshot/cluster_32:0.015563170046283136 - cluster/prob_snapshot/cluster_33:0.014261072153964627 - cluster/prob_snapshot/cluster_34:0.014261072153964627 - cluster/prob_snapshot/cluster_35:0.018228750445843742 - cluster/prob_snapshot/cluster_36:0.015563170046283136 - cluster/prob_snapshot/cluster_37:0.01054079246162603 - cluster/prob_snapshot/cluster_38:0.012064246995638683 - cluster/prob_snapshot/cluster_39:0.014261072153964627 - cluster/prob_snapshot/cluster_40:0.02787599373003158 - cluster/prob_snapshot/cluster_41:0.018935603587388074 - 
cluster/prob_snapshot/cluster_42:0.010150163093930476 - cluster/prob_snapshot/cluster_43:0.018452525269337907 - cluster/prob_snapshot/cluster_44:0.020476115603790104 - cluster/prob_snapshot/cluster_45:0.012400932307795329 - cluster/prob_snapshot/cluster_46:0.01711266653814216 - cluster/prob_snapshot/cluster_47:0.015563170046283136 - cluster/prob_snapshot/cluster_48:0.014261072153964627 - cluster/prob_snapshot/cluster_49:0.019283449738621738 - cluster/prob_snapshot/cluster_50:0.015389991026604773 - cluster/prob_snapshot/cluster_51:0.016474638570906093 - cluster/prob_snapshot/cluster_52:0.014508470753505141 - cluster/prob_snapshot/cluster_53:0.016474638570906093 - cluster/prob_snapshot/cluster_54:0.014261072153964627 - cluster/prob_snapshot/cluster_55:0.021592199511491682 - cluster/prob_snapshot/cluster_56:0.016474638570906093 - cluster/prob_snapshot/cluster_57:0.015563170046283136 - cluster/prob_snapshot/cluster_58:0.016474638570906093 - cluster/prob_snapshot/cluster_59:0.016474638570906093 - cluster/prob_snapshot/cluster_60:0.01054079246162603 - cluster/prob_snapshot/cluster_61:0.014651701521660178 - cluster/prob_snapshot/cluster_62:0.02083294623048076 - cluster/prob_snapshot/cluster_63:0.010788191061166543
[36m(TaskRunner pid=2823680)[0m Training Progress:  38%|███▊      | 308/800 [10:21:49<17:52:34, 130.80s/it]
[36m(TaskRunner pid=2823680)[0m step:308 - global_seqlen/min:426030 - global_seqlen/max:467274 - global_seqlen/minmax_diff:41244 - global_seqlen/balanced_min:447217 - global_seqlen/balanced_max:447478 - global_seqlen/mean:447336.0 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.16614880874515214) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009951399639248848 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.032537645514821634) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0011079816782086631) - actor/ppo_kl:np.float64(0.0004902408970239054) - actor/pg_clipfrac_lower:np.float64(3.075409179647513e-05) - actor/grad_norm:np.float64(0.4111509534219901) - perf/mfu/actor:np.float64(0.23776835634477878) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(104.46803665161133) - actor/lr:np.float64(1e-06) - training/global_step:308 - training/epoch:0 - critic/score/mean:0.6515957713127136 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6661834120750427 - critic/rewards/max:1.7429159879684448 - critic/rewards/min:-0.056668031960725784 - critic/advantages/mean:-0.06761547178030014 - critic/advantages/max:2.4747304916381836 - critic/advantages/min:-2.4747095108032227 - critic/returns/mean:-0.06761547178030014 - critic/returns/max:2.4747304916381836 - critic/returns/min:-2.4747095108032227 - response_length/mean:1429.7899169921875 - response_length/max:8192.0 - response_length/min:106.0 - response_length/clip_ratio:0.04388297721743584 - response_length_non_aborted/mean:1429.7899169921875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:106.0 - response_length_non_aborted/clip_ratio:0.04388297721743584 - response/aborted_ratio:0.0 - prompt_length/mean:236.08511352539062 - prompt_length/max:502.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.935256093740463e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.925709892064333) - timing_s/agent_loop/generate_sequences/max:np.float64(34.59212465584278) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.022731125774953) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(34.59212465584278) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:187 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:36.478578387759626 - timing_s/reward:0.00021638907492160797 - timing_s/old_log_prob:11.43122834712267 - timing_s/ref:24.655248501338065 - timing_s/adv:0.0656428374350071 - timing_s/update_actor:22.54571117926389 - timing_s/update_weights:31.23160408809781 - timing_s/step:126.80937886983156 - timing_s/stop_profile:6.472226232290268e-05 - timing_per_token_ms/adv:5.2399494096137496e-05 - timing_per_token_ms/update_actor:0.017997147990452825 - timing_per_token_ms/gen:0.03392718613596294 - timing_per_token_ms/ref:0.019681089342973603 - perf/total_num_tokens:1789344 - perf/time_per_step:126.80937886983156 - perf/throughput:3527.6255115103554 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:419.0 - frontier/mean_score:2.580456042857143 - frontier/mean_frontier_pct:0.1754628648009134 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.6569999999999996 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.9317299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.8319299999999994 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:2.2319299999999993 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:2.0765509999999994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:3.5509999999999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:48.0 - frontier/cluster_8/score:2.8168036999999995 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:2.6569999999999996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.9 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.6569999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.51 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.0569999999999995 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:2.7598999999999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:32.0 - frontier/cluster_18/score:3.3598999999999997 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.6569999999999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.3 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.9429999999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:48.0 - frontier/cluster_23/score:2.462350999999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:3.5509999999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.9429999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:3.0538999999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.8319299999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.6878699999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.6569999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:2.0569999999999995 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - 
frontier/cluster_33/score:2.3 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:2.9398999999999997 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.09 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.2619899999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:80.0 - frontier/cluster_40/score:4.4957899999999995 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:3.0377299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.637 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.9831929999999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:3.3023509999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:3.0769999999999995 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - 
frontier/cluster_50/score:2.4820699999999993 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:32.0 - frontier/cluster_51/score:2.7598999999999996 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.339899999999999 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:2.6569999999999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.51 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:3.4823509999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.6569999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:16.0 - frontier/cluster_57/score:2.6569999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.6569999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.6569999999999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:2.3629999999999995 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:32.0 - frontier/cluster_62/score:3.2519299999999993 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.5179299999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:308.0 - cluster/prob_snapshot/cluster_0:0.018927379126052312 - cluster/prob_snapshot/cluster_1:0.016343856463412736 - cluster/prob_snapshot/cluster_2:0.008261121276011783 - cluster/prob_snapshot/cluster_3:0.02418503228185689 - cluster/prob_snapshot/cluster_4:0.017419893652402117 - 
cluster/prob_snapshot/cluster_5:0.01372914699148844 - cluster/prob_snapshot/cluster_6:0.012773372782444928 - cluster/prob_snapshot/cluster_7:0.02184306898817412 - cluster/prob_snapshot/cluster_8:0.017326848083707155 - cluster/prob_snapshot/cluster_9:0.016343856463412736 - cluster/prob_snapshot/cluster_10:0.017838608861082777 - cluster/prob_snapshot/cluster_11:0.014147862200169099 - cluster/prob_snapshot/cluster_12:0.01230248886971226 - cluster/prob_snapshot/cluster_13:0.01230248886971226 - cluster/prob_snapshot/cluster_14:0.016343856463412736 - cluster/prob_snapshot/cluster_15:0.015439623531488887 - cluster/prob_snapshot/cluster_16:0.012653109802499058 - cluster/prob_snapshot/cluster_17:0.016976819515759432 - cluster/prob_snapshot/cluster_18:0.020667566176673112 - cluster/prob_snapshot/cluster_19:0.014147862200169099 - cluster/prob_snapshot/cluster_20:0.016343856463412736 - cluster/prob_snapshot/cluster_21:0.014147862200169099 - cluster/prob_snapshot/cluster_22:0.01195186793692546 - cluster/prob_snapshot/cluster_23:0.015146522885412423 - cluster/prob_snapshot/cluster_24:0.02184306898817412 - cluster/prob_snapshot/cluster_25:0.01195186793692546 - cluster/prob_snapshot/cluster_26:0.01174887687057521 - cluster/prob_snapshot/cluster_27:0.018785285379607136 - cluster/prob_snapshot/cluster_28:0.017419893652402117 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.016533745379116744 - cluster/prob_snapshot/cluster_31:0.016343856463412736 - cluster/prob_snapshot/cluster_32:0.012653109802499058 - cluster/prob_snapshot/cluster_33:0.014147862200169099 - cluster/prob_snapshot/cluster_34:0.014147862200169099 - cluster/prob_snapshot/cluster_35:0.018084043514033536 - cluster/prob_snapshot/cluster_36:0.015439623531488887 - cluster/prob_snapshot/cluster_37:0.012856100868849312 - cluster/prob_snapshot/cluster_38:0.013914053399200215 - cluster/prob_snapshot/cluster_39:0.014147862200169099 - cluster/prob_snapshot/cluster_40:0.02765470321778184 - 
cluster/prob_snapshot/cluster_41:0.01868581975709551 - cluster/prob_snapshot/cluster_42:0.010069587139859486 - cluster/prob_snapshot/cluster_43:0.01835034933935176 - cluster/prob_snapshot/cluster_44:0.020313568210691572 - cluster/prob_snapshot/cluster_45:0.01230248886971226 - cluster/prob_snapshot/cluster_46:0.016976819515759432 - cluster/prob_snapshot/cluster_47:0.015439623531488887 - cluster/prob_snapshot/cluster_48:0.014147862200169099 - cluster/prob_snapshot/cluster_49:0.018927379126052312 - cluster/prob_snapshot/cluster_50:0.015267819274423352 - cluster/prob_snapshot/cluster_51:0.016976819515759432 - cluster/prob_snapshot/cluster_52:0.014393296853119854 - cluster/prob_snapshot/cluster_53:0.016343856463412736 - cluster/prob_snapshot/cluster_54:0.015439623531488887 - cluster/prob_snapshot/cluster_55:0.02142079220896568 - cluster/prob_snapshot/cluster_56:0.016343856463412736 - cluster/prob_snapshot/cluster_57:0.016343856463412736 - cluster/prob_snapshot/cluster_58:0.016343856463412736 - cluster/prob_snapshot/cluster_59:0.016343856463412736 - cluster/prob_snapshot/cluster_60:0.010457115539255421 - cluster/prob_snapshot/cluster_61:0.014535390599565034 - cluster/prob_snapshot/cluster_62:0.020003416315041694 - cluster/prob_snapshot/cluster_63:0.009337158465001164
[36m(RewardLoopWorker pid=2826757)[0m WARNING:2026-04-12 21:54:15,017:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  39%|███▊      | 309/800 [10:23:47<17:18:54, 126.95s/it]
[36m(TaskRunner pid=2823680)[0m step:309 - global_seqlen/min:355461 - global_seqlen/max:507034 - global_seqlen/minmax_diff:151573 - global_seqlen/balanced_min:411918 - global_seqlen/balanced_max:412178 - global_seqlen/mean:412035.5 - frontier/skipped_zero_acc_count:20.0 - actor/entropy:np.float64(0.1456408404779655) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010422728955745697 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.058680386522610206) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0008884696271033372) - actor/ppo_kl:np.float64(7.054291365896595e-05) - actor/pg_clipfrac_lower:np.float64(9.947085278259624e-06) - actor/grad_norm:np.float64(0.45213656553200315) - perf/mfu/actor:np.float64(0.16698210912149838) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(104.37315368652344) - actor/lr:np.float64(1e-06) - training/global_step:309 - training/epoch:0 - critic/score/mean:0.6851851940155029 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6992512345314026 - critic/rewards/max:1.5024054050445557 - critic/rewards/min:-0.06499674171209335 - critic/advantages/mean:-0.05550117418169975 - critic/advantages/max:2.4748377799987793 - critic/advantages/min:-2.4748263359069824 - critic/returns/mean:-0.05550117418169975 - critic/returns/max:2.4748377799987793 - critic/returns/min:-2.4748263359069824 - response_length/mean:1294.914306640625 - response_length/max:8192.0 - response_length/min:164.0 - response_length/clip_ratio:0.03703703731298447 - response_length_non_aborted/mean:1294.914306640625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:164.0 - response_length_non_aborted/clip_ratio:0.03703703731298447 - response/aborted_ratio:0.0 - prompt_length/mean:225.47222900390625 - prompt_length/max:393.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.544473141431808e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3256396688520908) - timing_s/agent_loop/generate_sequences/max:np.float64(34.98388358298689) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.323793556074634) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(34.98388358298689) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:189 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:36.52193014789373 - timing_s/reward:0.00012631528079509735 - timing_s/old_log_prob:11.914813091047108 - timing_s/ref:14.829663719981909 - timing_s/adv:0.07269431930035353 - timing_s/update_actor:29.271145842038095 - timing_s/update_weights:24.70611187722534 - timing_s/step:117.72329095005989 - timing_s/stop_profile:5.136243999004364e-05 - timing_per_token_ms/adv:5.5339178252023446e-05 - timing_per_token_ms/update_actor:0.022282912516186713 - timing_per_token_ms/gen:0.032643666683852005 - timing_per_token_ms/ref:0.011289209554695602 - perf/total_num_tokens:1648142 - perf/time_per_step:117.72329095005989 - perf/throughput:3500.0338223197655 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:439.0 - frontier/mean_score:2.6136824966666663 - frontier/mean_frontier_pct:0.19438622244188797 - frontier/batch_easy_count:4.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.6569999999999996 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.9317299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.8319299999999994 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:2.2319299999999993 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:2.0765509999999994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:3.5509999999999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:2.8717625899999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:2.6569999999999996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:3.53 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.6569999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.51 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:32.0 - frontier/cluster_16/score:2.339899999999999 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:32.0 - frontier/cluster_17/score:2.7598999999999996 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:3.8519299999999994 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:16.0 - frontier/cluster_20/score:2.6569999999999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.91 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.9429999999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.623645699999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:3.5509999999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.9429999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:3.0538999999999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:32.0 - frontier/cluster_28/score:2.8319299999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.7815089999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.6569999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:2.0569999999999995 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - 
frontier/cluster_33/score:2.51 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:48.0 - frontier/cluster_35/score:2.9579299999999997 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.09 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:48.0 - frontier/cluster_38/score:2.2619899999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.51 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:4.647053 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:32.0 - frontier/cluster_41/score:3.0377299999999994 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.637 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.9831929999999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:3.3023509999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:3.6538999999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - 
frontier/cluster_50/score:2.4820699999999993 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.2319299999999993 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.339899999999999 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:32.0 - frontier/cluster_53/score:2.7598999999999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.51 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:3.4823509999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.6569999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.6569999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:16.0 - frontier/cluster_59/score:2.6569999999999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:32.0 - frontier/cluster_61/score:2.3629999999999995 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:3.1763509999999995 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.5179299999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:309.0 - cluster/prob_snapshot/cluster_0:0.01868676470977598 - cluster/prob_snapshot/cluster_1:0.016136085093881956 - cluster/prob_snapshot/cluster_2:0.008156101724156367 - cluster/prob_snapshot/cluster_3:0.02387757991952145 - cluster/prob_snapshot/cluster_4:0.017198443153901816 - 
cluster/prob_snapshot/cluster_5:0.013554615131196065 - cluster/prob_snapshot/cluster_6:0.012610991207296071 - cluster/prob_snapshot/cluster_7:0.021565388847713522 - cluster/prob_snapshot/cluster_8:0.017440348333333397 - cluster/prob_snapshot/cluster_9:0.016136085093881956 - cluster/prob_snapshot/cluster_10:0.02143785486691882 - cluster/prob_snapshot/cluster_11:0.013968007420372035 - cluster/prob_snapshot/cluster_12:0.012146093409019162 - cluster/prob_snapshot/cluster_13:0.012146093409019162 - cluster/prob_snapshot/cluster_14:0.016136085093881956 - cluster/prob_snapshot/cluster_15:0.015243347228319047 - cluster/prob_snapshot/cluster_16:0.014210321983881963 - cluster/prob_snapshot/cluster_17:0.01676100159977599 - cluster/prob_snapshot/cluster_18:0.023392950792501588 - cluster/prob_snapshot/cluster_19:0.013968007420372035 - cluster/prob_snapshot/cluster_20:0.016136085093881956 - cluster/prob_snapshot/cluster_21:0.0115995192056133 - cluster/prob_snapshot/cluster_22:0.011799929746862114 - cluster/prob_snapshot/cluster_23:0.01593352287218573 - cluster/prob_snapshot/cluster_24:0.021565388847713522 - cluster/prob_snapshot/cluster_25:0.011799929746862114 - cluster/prob_snapshot/cluster_26:0.0115995192056133 - cluster/prob_snapshot/cluster_27:0.018546477330901808 - cluster/prob_snapshot/cluster_28:0.017198443153901816 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.016892234066013735 - cluster/prob_snapshot/cluster_31:0.016136085093881956 - cluster/prob_snapshot/cluster_32:0.012492257071176205 - cluster/prob_snapshot/cluster_33:0.015243347228319047 - cluster/prob_snapshot/cluster_34:0.013968007420372035 - cluster/prob_snapshot/cluster_35:0.017963647038670025 - cluster/prob_snapshot/cluster_36:0.015243347228319047 - cluster/prob_snapshot/cluster_37:0.012692667612425023 - cluster/prob_snapshot/cluster_38:0.013737170915133625 - cluster/prob_snapshot/cluster_39:0.015243347228319047 - cluster/prob_snapshot/cluster_40:0.02822176990733136 - 
cluster/prob_snapshot/cluster_41:0.018448276165689885 - cluster/prob_snapshot/cluster_42:0.009941577455282184 - cluster/prob_snapshot/cluster_43:0.018117070417566047 - cluster/prob_snapshot/cluster_44:0.020055331857683916 - cluster/prob_snapshot/cluster_45:0.012146093409019162 - cluster/prob_snapshot/cluster_46:0.01676100159977599 - cluster/prob_snapshot/cluster_47:0.015243347228319047 - cluster/prob_snapshot/cluster_48:0.013968007420372035 - cluster/prob_snapshot/cluster_49:0.022190305353607557 - cluster/prob_snapshot/cluster_50:0.015073727033862092 - cluster/prob_snapshot/cluster_51:0.013554615131196065 - cluster/prob_snapshot/cluster_52:0.014210321983881963 - cluster/prob_snapshot/cluster_53:0.01676100159977599 - cluster/prob_snapshot/cluster_54:0.015243347228319047 - cluster/prob_snapshot/cluster_55:0.02114848026449564 - cluster/prob_snapshot/cluster_56:0.016136085093881956 - cluster/prob_snapshot/cluster_57:0.01676100159977599 - cluster/prob_snapshot/cluster_58:0.016136085093881956 - cluster/prob_snapshot/cluster_59:0.016136085093881956 - cluster/prob_snapshot/cluster_60:0.010324179397666288 - cluster/prob_snapshot/cluster_61:0.014350609362756138 - cluster/prob_snapshot/cluster_62:0.01929012797291571 - cluster/prob_snapshot/cluster_63:0.009218459784176226
[36m(RewardLoopWorker pid=2826756)[0m WARNING:2026-04-12 21:56:11,515:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  39%|███▉      | 310/800 [10:25:43<16:49:21, 123.59s/it]
[36m(TaskRunner pid=2823680)[0m step:310 - global_seqlen/min:330168 - global_seqlen/max:499108 - global_seqlen/minmax_diff:168940 - global_seqlen/balanced_min:396619 - global_seqlen/balanced_max:396777 - global_seqlen/mean:396691.5 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.18899914583501717) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013326829299330711 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.03186630159325432) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0010784517323069547) - actor/ppo_kl:np.float64(0.00016052349200516383) - actor/pg_clipfrac_lower:np.float64(9.47195421000894e-06) - actor/grad_norm:np.float64(0.43302344034115475) - perf/mfu/actor:np.float64(0.16100711549570984) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(103.83749008178711) - actor/lr:np.float64(1e-06) - training/global_step:310 - training/epoch:0 - critic/score/mean:0.6460526585578918 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6573045253753662 - critic/rewards/max:1.2185776233673096 - critic/rewards/min:-0.08470703661441803 - critic/advantages/mean:-0.07924002408981323 - critic/advantages/max:2.4746570587158203 - critic/advantages/min:-2.474816083908081 - critic/returns/mean:-0.07924002408981323 - critic/returns/max:2.4746570587158203 - critic/returns/min:-2.474816083908081 - response_length/mean:1229.9617919921875 - response_length/max:8192.0 - response_length/min:219.0 - response_length/clip_ratio:0.03684210404753685 - response_length_non_aborted/mean:1229.9617919921875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:219.0 - response_length_non_aborted/clip_ratio:0.03684210404753685 - response/aborted_ratio:0.0 - prompt_length/mean:239.43157958984375 - prompt_length/max:404.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.787121623754501e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2163889165967703) - timing_s/agent_loop/generate_sequences/max:np.float64(31.21455918159336) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.7893012886115685) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(31.21455918159336) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:211 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:33.15897359512746 - timing_s/reward:0.0001241583377122879 - timing_s/old_log_prob:10.956537912599742 - timing_s/ref:15.473232314921916 - timing_s/adv:0.0803873548284173 - timing_s/update_actor:29.32205630838871 - timing_s/update_weights:25.7706998558715 - timing_s/step:115.1221309825778 - timing_s/stop_profile:5.295872688293457e-05 - timing_per_token_ms/adv:7.198401312071782e-05 - timing_per_token_ms/update_actor:0.026256857070800527 - timing_per_token_ms/gen:0.03547283088064078 - timing_per_token_ms/ref:0.013855728433342004 - perf/total_num_tokens:1586766 - perf/time_per_step:115.1221309825778 - perf/throughput:3445.8318015328778 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:471.0 - frontier/mean_score:2.613384334761905 - frontier/mean_frontier_pct:0.21515750138196765 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:2.6569999999999996 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.9317299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:32.0 - frontier/cluster_4/score:2.8319299999999994 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:2.2319299999999993 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:2.0765509999999994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:3.5509999999999997 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:2.8717625899999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:2.7598999999999996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:3.53 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.6569999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.51 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:2.5379299999999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:2.2319299999999993 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:48.0 - frontier/cluster_18/score:3.5963509999999994 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:2.7598999999999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.91 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.9429999999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.623645699999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:3.5509999999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.9429999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:32.0 - frontier/cluster_27/score:3.0377299999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.2823509999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.7815089999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:16.0 - frontier/cluster_31/score:2.6569999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:2.0569999999999995 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 
- frontier/cluster_33/score:2.51 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:3.5705509999999996 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.51 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.09 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.4833929999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.51 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:4.152937099999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.6264109999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.637 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:48.0 - frontier/cluster_43/score:2.9831929999999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:3.3023509999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.6569999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:3.6538999999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:48.0 - 
frontier/cluster_50/score:2.4820699999999993 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.2319299999999993 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.339899999999999 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:32.0 - frontier/cluster_53/score:2.7598999999999996 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.51 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:3.4823509999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.6569999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.6569999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:32.0 - frontier/cluster_59/score:2.7598999999999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:3.1234456999999995 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.5179299999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:310.0 - cluster/prob_snapshot/cluster_0:0.018688896689097043 - cluster/prob_snapshot/cluster_1:0.01613792606530089 - cluster/prob_snapshot/cluster_2:0.008157032256567219 - cluster/prob_snapshot/cluster_3:0.023880304120709624 - cluster/prob_snapshot/cluster_4:0.017200405330111985 - 
cluster/prob_snapshot/cluster_5:0.013556161581831769 - cluster/prob_snapshot/cluster_6:0.012612429999558384 - cluster/prob_snapshot/cluster_7:0.021567849250238413 - cluster/prob_snapshot/cluster_8:0.017442338108587502 - cluster/prob_snapshot/cluster_9:0.016762913868130946 - cluster/prob_snapshot/cluster_10:0.021440300719048608 - cluster/prob_snapshot/cluster_11:0.013969601035074162 - cluster/prob_snapshot/cluster_12:0.012147479160934055 - cluster/prob_snapshot/cluster_13:0.012147479160934055 - cluster/prob_snapshot/cluster_14:0.01613792606530089 - cluster/prob_snapshot/cluster_15:0.015245086346972239 - cluster/prob_snapshot/cluster_16:0.01541472589345468 - cluster/prob_snapshot/cluster_17:0.013556161581831769 - cluster/prob_snapshot/cluster_18:0.02184329941395217 - cluster/prob_snapshot/cluster_19:0.015245086346972239 - cluster/prob_snapshot/cluster_20:0.016762913868130946 - cluster/prob_snapshot/cluster_21:0.011600842598692023 - cluster/prob_snapshot/cluster_22:0.011801276004847434 - cluster/prob_snapshot/cluster_23:0.015935340733212115 - cluster/prob_snapshot/cluster_24:0.021567849250238413 - cluster/prob_snapshot/cluster_25:0.011801276004847434 - cluster/prob_snapshot/cluster_26:0.011600842598692023 - cluster/prob_snapshot/cluster_27:0.0184503809357721 - cluster/prob_snapshot/cluster_28:0.013862405605218498 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.016894161306725258 - cluster/prob_snapshot/cluster_31:0.01613792606530089 - cluster/prob_snapshot/cluster_32:0.012493682317020673 - cluster/prob_snapshot/cluster_33:0.015245086346972239 - cluster/prob_snapshot/cluster_34:0.013969601035074162 - cluster/prob_snapshot/cluster_35:0.021686596932776123 - cluster/prob_snapshot/cluster_36:0.015245086346972239 - cluster/prob_snapshot/cluster_37:0.012694115723176087 - cluster/prob_snapshot/cluster_38:0.015083482357954751 - cluster/prob_snapshot/cluster_39:0.015245086346972239 - cluster/prob_snapshot/cluster_40:0.02522385843945995 - 
cluster/prob_snapshot/cluster_41:0.022025876025741013 - cluster/prob_snapshot/cluster_42:0.009942711693224525 - cluster/prob_snapshot/cluster_43:0.01811913740027217 - cluster/prob_snapshot/cluster_44:0.020057619977294865 - cluster/prob_snapshot/cluster_45:0.012147479160934055 - cluster/prob_snapshot/cluster_46:0.016762913868130946 - cluster/prob_snapshot/cluster_47:0.01613792606530089 - cluster/prob_snapshot/cluster_48:0.013969601035074162 - cluster/prob_snapshot/cluster_49:0.02219283705306847 - cluster/prob_snapshot/cluster_50:0.015075446800489792 - cluster/prob_snapshot/cluster_51:0.013556161581831769 - cluster/prob_snapshot/cluster_52:0.014211943244334793 - cluster/prob_snapshot/cluster_53:0.016762913868130946 - cluster/prob_snapshot/cluster_54:0.015245086346972239 - cluster/prob_snapshot/cluster_55:0.021150893101778933 - cluster/prob_snapshot/cluster_56:0.01613792606530089 - cluster/prob_snapshot/cluster_57:0.016762913868130946 - cluster/prob_snapshot/cluster_58:0.01613792606530089 - cluster/prob_snapshot/cluster_59:0.016762913868130946 - cluster/prob_snapshot/cluster_60:0.010325357286793947 - cluster/prob_snapshot/cluster_61:0.011868694514190618 - cluster/prob_snapshot/cluster_62:0.01897099577552954 - cluster/prob_snapshot/cluster_63:0.009219511521378313
[36m(TaskRunner pid=2823680)[0m Training Progress:  39%|███▉      | 311/800 [10:27:46<16:47:24, 123.61s/it]
[36m(TaskRunner pid=2823680)[0m step:311 - global_seqlen/min:376923 - global_seqlen/max:518957 - global_seqlen/minmax_diff:142034 - global_seqlen/balanced_min:457758 - global_seqlen/balanced_max:457883 - global_seqlen/mean:457807.25 - frontier/skipped_zero_acc_count:35.0 - actor/entropy:np.float64(0.16484238308398647) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.00935362745076418 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.08006111023132689) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00095186632221312) - actor/ppo_kl:np.float64(0.0011879479790691415) - actor/pg_clipfrac_lower:np.float64(1.9764983973916658e-05) - actor/grad_norm:np.float64(0.38236135865251225) - perf/mfu/actor:np.float64(0.24723136096667908) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(104.29732894897461) - actor/lr:np.float64(1e-06) - training/global_step:311 - training/epoch:0 - critic/score/mean:0.6411290168762207 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6599138975143433 - critic/rewards/max:1.4531898498535156 - critic/rewards/min:-0.4605543613433838 - critic/advantages/mean:-0.0653686448931694 - critic/advantages/max:2.4746439456939697 - critic/advantages/min:-2.474820137023926 - critic/returns/mean:-0.0653686448931694 - critic/returns/max:2.4746439456939697 - critic/returns/min:-2.474820137023926 - response_length/mean:1350.81591796875 - response_length/max:8192.0 - response_length/min:79.0 - response_length/clip_ratio:0.04032257944345474 - response_length_non_aborted/mean:1350.81591796875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:79.0 - response_length_non_aborted/clip_ratio:0.04032257944345474 - response/aborted_ratio:0.0 - prompt_length/mean:234.90322875976562 - prompt_length/max:378.0 - prompt_length/min:185.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.093938231468201e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.7208566311746836) - timing_s/agent_loop/generate_sequences/max:np.float64(33.81219065375626) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.038187456295418) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(33.81219065375626) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:205 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:36.362195443362 - timing_s/reward:0.00011803489178419113 - timing_s/old_log_prob:11.023440852761269 - timing_s/ref:22.871998090296984 - timing_s/adv:0.0749894492328167 - timing_s/update_actor:22.289161710999906 - timing_s/update_weights:30.325229234062135 - timing_s/step:123.3521721502766 - timing_s/stop_profile:7.510650902986526e-05 - timing_per_token_ms/adv:6.356250067412575e-05 - timing_per_token_ms/update_actor:0.018892722519971948 - timing_per_token_ms/gen:0.036181036991147324 - timing_per_token_ms/ref:0.019386745854334076 - perf/total_num_tokens:1831229 - perf/time_per_step:123.3521721502766 - perf/throughput:3711.383772328434 - frontier/active_count:63.0 - frontier/completed_count:1.0 - frontier/blacklisted_count:506.0 - frontier/mean_score:2.5957227476666658 - frontier/mean_frontier_pct:0.22873675838026256 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:2.1598999999999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:48.0 - frontier/cluster_2/score:1.343 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.9317299999999995 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:48.0 - frontier/cluster_4/score:3.4823509999999995 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:2.2319299999999993 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:2.0765509999999994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:32.0 - frontier/cluster_7/score:3.3856999999999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:64.0 - frontier/cluster_8/score:2.9102338129999996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:2.8319299999999994 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:3.53 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:0.0 - frontier/cluster_13/score:2.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:16.0 - frontier/cluster_14/score:2.6569999999999996 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.51 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:48.0 - frontier/cluster_16/score:2.5379299999999994 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:48.0 - frontier/cluster_17/score:2.2319299999999993 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.417445699999999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:2.8319299999999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:1.91 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.9429999999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.736551989999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:3.5509999999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.9429999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.4264109999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.2823509999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.7815089999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:2.7598999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:2.339899999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:2.51 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:3.3993856999999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:16.0 - frontier/cluster_36/score:2.6569999999999996 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.09 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.4833929999999995 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.51 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:4.152937099999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.6264109999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.637 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9882350999999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:48.0 - frontier/cluster_44/score:3.3023509999999994 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.6569999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:3.6538999999999997 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - 
frontier/cluster_50/score:2.037448999999999 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.2319299999999993 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.339899999999999 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:2.2319299999999993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.51 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:3.4823509999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.6569999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.6569999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:32.0 - frontier/cluster_59/score:2.7598999999999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:3.1234456999999995 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.5179299999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:311.0 - cluster/prob_snapshot/cluster_0:0.01881605802668023 - cluster/prob_snapshot/cluster_1:0.01320793101456829 - cluster/prob_snapshot/cluster_2:0.008212533613854909 - cluster/prob_snapshot/cluster_3:0.02404278837349349 - cluster/prob_snapshot/cluster_4:0.021294806137558638 - 
cluster/prob_snapshot/cluster_5:0.013648399217253299 - cluster/prob_snapshot/cluster_6:0.012698246380032778 - cluster/prob_snapshot/cluster_7:0.020703778895330274 - cluster/prob_snapshot/cluster_8:0.017796271789605092 - cluster/prob_snapshot/cluster_9:0.017317438806466214 - cluster/prob_snapshot/cluster_10:0.021586182916535985 - cluster/prob_snapshot/cluster_11:0.014064651758649507 - cluster/prob_snapshot/cluster_12:0.01223013196404305 - cluster/prob_snapshot/cluster_13:0.01223013196404305 - cluster/prob_snapshot/cluster_14:0.01624773031423119 - cluster/prob_snapshot/cluster_15:0.015348815614874028 - cluster/prob_snapshot/cluster_16:0.015519609407751886 - cluster/prob_snapshot/cluster_17:0.013648399217253299 - cluster/prob_snapshot/cluster_18:0.020897905945475734 - cluster/prob_snapshot/cluster_19:0.015348815614874028 - cluster/prob_snapshot/cluster_20:0.017317438806466214 - cluster/prob_snapshot/cluster_21:0.011679776025661113 - cluster/prob_snapshot/cluster_22:0.011881573203067822 - cluster/prob_snapshot/cluster_23:0.0167341959820823 - cluster/prob_snapshot/cluster_24:0.021714599302158435 - cluster/prob_snapshot/cluster_25:0.011881573203067822 - cluster/prob_snapshot/cluster_26:0.011679776025661113 - cluster/prob_snapshot/cluster_27:0.014837663364502828 - cluster/prob_snapshot/cluster_28:0.013956726959132807 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017009111064586708 - cluster/prob_snapshot/cluster_31:0.016876970603781207 - cluster/prob_snapshot/cluster_32:0.014308642891332162 - cluster/prob_snapshot/cluster_33:0.015348815614874028 - cluster/prob_snapshot/cluster_34:0.014064651758649507 - cluster/prob_snapshot/cluster_35:0.020787467853840426 - cluster/prob_snapshot/cluster_36:0.01624773031423119 - cluster/prob_snapshot/cluster_37:0.012780487902424987 - cluster/prob_snapshot/cluster_38:0.01518611205429038 - cluster/prob_snapshot/cluster_39:0.015348815614874028 - cluster/prob_snapshot/cluster_40:0.025395484385685122 - 
cluster/prob_snapshot/cluster_41:0.02217574254292866 - cluster/prob_snapshot/cluster_42:0.010010363012569237 - cluster/prob_snapshot/cluster_43:0.018273254806292688 - cluster/prob_snapshot/cluster_44:0.020194094260794763 - cluster/prob_snapshot/cluster_45:0.01223013196404305 - cluster/prob_snapshot/cluster_46:0.016876970603781207 - cluster/prob_snapshot/cluster_47:0.01624773031423119 - cluster/prob_snapshot/cluster_48:0.014064651758649507 - cluster/prob_snapshot/cluster_49:0.02234383959170845 - cluster/prob_snapshot/cluster_50:0.01245913507000377 - cluster/prob_snapshot/cluster_51:0.013648399217253299 - cluster/prob_snapshot/cluster_52:0.014308642891332162 - cluster/prob_snapshot/cluster_53:0.013648399217253299 - cluster/prob_snapshot/cluster_54:0.015348815614874028 - cluster/prob_snapshot/cluster_55:0.021294806137558638 - cluster/prob_snapshot/cluster_56:0.01624773031423119 - cluster/prob_snapshot/cluster_57:0.016876970603781207 - cluster/prob_snapshot/cluster_58:0.01624773031423119 - cluster/prob_snapshot/cluster_59:0.016876970603781207 - cluster/prob_snapshot/cluster_60:0.010395612169436593 - cluster/prob_snapshot/cluster_61:0.01194945043546826 - cluster/prob_snapshot/cluster_62:0.019100076546761408 - cluster/prob_snapshot/cluster_63:0.009282242106089932
[36m(RewardLoopWorker pid=2826760)[0m WARNING:2026-04-12 22:00:10,218:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  39%|███▉      | 312/800 [10:29:48<16:39:27, 122.88s/it]
[36m(RewardLoopWorker pid=2826758)[0m WARNING:2026-04-12 22:00:12,526:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m step:312 - global_seqlen/min:399606 - global_seqlen/max:521670 - global_seqlen/minmax_diff:122064 - global_seqlen/balanced_min:456985 - global_seqlen/balanced_max:457253 - global_seqlen/mean:457100.25 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.16383929182167933) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.006131293252110481 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.0011922989797312766) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.001119878988111336) - actor/ppo_kl:np.float64(-0.001968102437128595) - actor/pg_clipfrac_lower:np.float64(0.00010904648289674104) - actor/grad_norm:np.float64(0.40444884076714516) - perf/mfu/actor:np.float64(0.252168717843576) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(104.18350219726562) - actor/lr:np.float64(1e-06) - training/global_step:312 - training/epoch:0 - critic/score/mean:0.6277472376823425 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.647508978843689 - critic/rewards/max:1.6683299541473389 - critic/rewards/min:-0.09275226294994354 - critic/advantages/mean:0.0012130059767514467 - critic/advantages/max:2.474652051925659 - critic/advantages/min:-2.4746932983398438 - critic/returns/mean:0.0012130059767514467 - critic/returns/max:2.474652051925659 - critic/returns/min:-2.4746932983398438 - response_length/mean:1295.8310546875 - response_length/max:8192.0 - response_length/min:214.0 - response_length/clip_ratio:0.0357142873108387 - response_length_non_aborted/mean:1295.8310546875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:214.0 - response_length_non_aborted/clip_ratio:0.0357142873108387 - response/aborted_ratio:0.0 - prompt_length/mean:241.12088012695312 - prompt_length/max:375.0 - prompt_length/min:186.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.28286463022232e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6016860185191035) - timing_s/agent_loop/generate_sequences/max:np.float64(34.60380039270967) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.084544034612918) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(34.60380039270967) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:197 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:37.13814485818148 - timing_s/reward:0.0001377509906888008 - timing_s/old_log_prob:10.783162977546453 - timing_s/ref:21.858921422623098 - timing_s/adv:0.07157304417341948 - timing_s/update_actor:21.74373967014253 - timing_s/update_weights:28.969133365899324 - timing_s/step:120.96424868609756 - timing_s/stop_profile:5.09507954120636e-05 - timing_per_token_ms/adv:6.39672716115362e-05 - timing_per_token_ms/update_actor:0.019433122027902854 - timing_per_token_ms/gen:0.039367736621754544 - timing_per_token_ms/ref:0.019536063890034146 - perf/total_num_tokens:1828401 - perf/time_per_step:120.96424868609756 - perf/throughput:3778.804522534389 - frontier/active_count:62.0 - frontier/completed_count:2.0 - frontier/blacklisted_count:543.0 - frontier/mean_score:2.5840338993403225 - frontier/mean_frontier_pct:0.2366583820116343 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:2.1598999999999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:48.0 - frontier/cluster_3/score:3.6522109999999994 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:2.2319299999999993 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:2.0765509999999994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:3.2699899999999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:2.9371636690999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:32.0 - frontier/cluster_9/score:2.8319299999999994 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:3.53 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:1.7 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:3.3598999999999997 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.51 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.0765509999999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.417445699999999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:32.0 - frontier/cluster_20/score:2.8319299999999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.237 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.9429999999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.736551989999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:3.5509999999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.9429999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.4264109999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.2823509999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.7815089999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:32.0 - frontier/cluster_31/score:2.7598999999999996 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:32.0 - frontier/cluster_32/score:2.339899999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - 
frontier/cluster_33/score:2.6569999999999996 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:3.3993856999999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:32.0 - frontier/cluster_36/score:2.7598999999999996 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:2.3629999999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.6383750999999993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.51 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:4.152937099999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.6264109999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.637 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9882350999999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:3.211645699999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.6569999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:3.6538999999999997 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.037448999999999 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.2319299999999993 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.339899999999999 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:2.2319299999999993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.6569999999999996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:48.0 - frontier/cluster_55/score:3.4823509999999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.6569999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:16.0 - frontier/cluster_58/score:2.6569999999999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:32.0 - frontier/cluster_59/score:2.7598999999999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:3.1234456999999995 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.5179299999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:312.0 - cluster/prob_snapshot/cluster_0:0.01920602987086752 - cluster/prob_snapshot/cluster_1:0.013481671731584906 - cluster/prob_snapshot/cluster_2:0.007740460722412355 - cluster/prob_snapshot/cluster_3:0.022796383997631117 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0139312688494265 - cluster/prob_snapshot/cluster_6:0.012961423638082488 - cluster/prob_snapshot/cluster_7:0.020410635559778383 - cluster/prob_snapshot/cluster_8:0.01833319894844376 - cluster/prob_snapshot/cluster_9:0.017676351047190723 - cluster/prob_snapshot/cluster_10:0.02203356693017951 - cluster/prob_snapshot/cluster_11:0.014356148424762854 - cluster/prob_snapshot/cluster_12:0.012483607325880743 - cluster/prob_snapshot/cluster_13:0.010611066226998632 - cluster/prob_snapshot/cluster_14:0.020971836127113352 - cluster/prob_snapshot/cluster_15:0.015666927193980332 - cluster/prob_snapshot/cluster_16:0.012961423638082488 - cluster/prob_snapshot/cluster_17:0.01162442929348066 - cluster/prob_snapshot/cluster_18:0.021331025088159816 - cluster/prob_snapshot/cluster_19:0.015666927193980332 - cluster/prob_snapshot/cluster_20:0.017676351047190723 - cluster/prob_snapshot/cluster_21:0.013962914793997612 - cluster/prob_snapshot/cluster_22:0.012127824517093141 - cluster/prob_snapshot/cluster_23:0.017081020235008758 - cluster/prob_snapshot/cluster_24:0.022164644807101258 - cluster/prob_snapshot/cluster_25:0.012127824517093141 - cluster/prob_snapshot/cluster_26:0.01192184499621611 - cluster/prob_snapshot/cluster_27:0.015145181067598807 - cluster/prob_snapshot/cluster_28:0.014245986831915617 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017361633064701606 - cluster/prob_snapshot/cluster_31:0.01722675392934913 - cluster/prob_snapshot/cluster_32:0.014605196390914172 - cluster/prob_snapshot/cluster_33:0.016584472332432567 - cluster/prob_snapshot/cluster_34:0.014356148424762854 - cluster/prob_snapshot/cluster_35:0.021218298114007116 - cluster/prob_snapshot/cluster_36:0.01722675392934913 - cluster/prob_snapshot/cluster_37:0.014749382055528096 - cluster/prob_snapshot/cluster_38:0.016468219363390664 - cluster/prob_snapshot/cluster_39:0.015666927193980332 - 
cluster/prob_snapshot/cluster_40:0.02592181800274096 - cluster/prob_snapshot/cluster_41:0.022635345463127253 - cluster/prob_snapshot/cluster_42:0.010217832596233389 - cluster/prob_snapshot/cluster_43:0.018651976792906984 - cluster/prob_snapshot/cluster_44:0.02004646189432669 - cluster/prob_snapshot/cluster_45:0.012483607325880743 - cluster/prob_snapshot/cluster_46:0.01722675392934913 - cluster/prob_snapshot/cluster_47:0.016584472332432567 - cluster/prob_snapshot/cluster_48:0.014356148424762854 - cluster/prob_snapshot/cluster_49:0.022806926404017824 - cluster/prob_snapshot/cluster_50:0.012717356631254192 - cluster/prob_snapshot/cluster_51:0.0139312688494265 - cluster/prob_snapshot/cluster_52:0.014605196390914172 - cluster/prob_snapshot/cluster_53:0.0139312688494265 - cluster/prob_snapshot/cluster_54:0.016584472332432567 - cluster/prob_snapshot/cluster_55:0.021736151227444066 - cluster/prob_snapshot/cluster_56:0.016584472332432567 - cluster/prob_snapshot/cluster_57:0.01722675392934913 - cluster/prob_snapshot/cluster_58:0.016584472332432567 - cluster/prob_snapshot/cluster_59:0.01722675392934913 - cluster/prob_snapshot/cluster_60:0.010611066226998632 - cluster/prob_snapshot/cluster_61:0.012197108537751779 - cluster/prob_snapshot/cluster_62:0.01949593481125535 - cluster/prob_snapshot/cluster_63:0.009474621034087076
(RewardLoopWorker pid=2826757) WARNING:2026-04-12 22:02:10,265:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  39%|███▉      | 313/800 [10:32:03<17:07:29, 126.59s/it]
[36m(TaskRunner pid=2823680)[0m step:313 - global_seqlen/min:409171 - global_seqlen/max:484402 - global_seqlen/minmax_diff:75231 - global_seqlen/balanced_min:437671 - global_seqlen/balanced_max:437834 - global_seqlen/mean:437745.75 - frontier/skipped_zero_acc_count:14.0 - actor/entropy:np.float64(0.15782797847988836) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010857452638447285 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.015168885598541237) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0013820925960955496) - actor/ppo_kl:np.float64(0.0036232742873963026) - actor/pg_clipfrac_lower:np.float64(3.0858595955738055e-05) - actor/grad_norm:np.float64(0.3997819220026334) - perf/mfu/actor:np.float64(0.20093733544223966) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(105.02309799194336) - actor/lr:np.float64(1e-06) - training/global_step:313 - training/epoch:0 - critic/score/mean:0.6304824352264404 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6445754766464233 - critic/rewards/max:1.465250015258789 - critic/rewards/min:-0.07973847538232803 - critic/advantages/mean:-0.06449843943119049 - critic/advantages/max:2.4744715690612793 - critic/advantages/min:-2.4746620655059814 - critic/returns/mean:-0.06449843943119049 - critic/returns/max:2.4744715690612793 - critic/returns/min:-2.4746620655059814 - response_length/mean:1419.3541259765625 - response_length/max:8192.0 - response_length/min:195.0 - response_length/clip_ratio:0.04495614022016525 - response_length_non_aborted/mean:1419.3541259765625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:195.0 - response_length_non_aborted/clip_ratio:0.04495614022016525 - response/aborted_ratio:0.0 - prompt_length/mean:239.03509521484375 - prompt_length/max:392.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.089522063732147e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6758415065705776) - timing_s/agent_loop/generate_sequences/max:np.float64(35.43492961674929) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.687014182390158) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(35.43492961674929) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:204 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:37.59073237515986 - timing_s/reward:0.00022440031170845032 - timing_s/old_log_prob:12.367882923223078 - timing_s/ref:26.030473835766315 - timing_s/adv:0.11358764674514532 - timing_s/update_actor:26.20995268970728 - timing_s/update_weights:32.2877648929134 - timing_s/step:135.01354094408453 - timing_s/stop_profile:6.599631160497665e-05 - timing_per_token_ms/adv:7.510170362222996e-05 - timing_per_token_ms/update_actor:0.017329455757381415 - timing_per_token_ms/gen:0.02903990369288591 - timing_per_token_ms/ref:0.017210788207860164 - perf/total_num_tokens:1750983 - perf/time_per_step:135.01354094408453 - perf/throughput:3242.235904184538 - frontier/active_count:62.0 - frontier/completed_count:2.0 - frontier/blacklisted_count:557.0 - frontier/mean_score:2.608415929501613 - frontier/mean_frontier_pct:0.2549218056349167 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:2.1598999999999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:64.0 - frontier/cluster_2/score:1.2401 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.4565476999999993 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:2.2319299999999993 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:64.0 - frontier/cluster_6/score:2.0765509999999994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:3.1889929999999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:2.9371636690999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:2.8823509999999994 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:3.53 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.51 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:1.7 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:3.2519299999999993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:2.51 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.0765509999999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.417445699999999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:2.8823509999999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.237 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.9429999999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.736551989999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:32.0 - frontier/cluster_24/score:3.5509999999999997 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.9429999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.4264109999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.2823509999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.7815089999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.2319299999999993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.5379299999999994 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 
- frontier/cluster_33/score:2.7598999999999996 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:3.3993856999999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:32.0 - frontier/cluster_36/score:2.8319299999999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:32.0 - frontier/cluster_37/score:2.3629999999999995 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.6383750999999993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.51 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:4.152937099999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.6264109999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.637 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9917645699999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:3.211645699999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.7598999999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.6569999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:32.0 - frontier/cluster_49/score:3.4577299999999997 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.037448999999999 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.2319299999999993 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.339899999999999 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:2.2319299999999993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:3.3598999999999997 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.9376456999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.6569999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:32.0 - frontier/cluster_58/score:3.3598999999999997 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:32.0 - frontier/cluster_59/score:2.8319299999999994 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:48.0 - frontier/cluster_62/score:3.1234456999999995 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.5179299999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:313.0 - cluster/prob_snapshot/cluster_0:0.019026502520841097 - cluster/prob_snapshot/cluster_1:0.013355652516985598 - cluster/prob_snapshot/cluster_2:0.007668107174551526 - cluster/prob_snapshot/cluster_3:0.021373420060922164 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.013801047049509543 - cluster/prob_snapshot/cluster_6:0.01284026741506503 - cluster/prob_snapshot/cluster_7:0.019719006614704132 - cluster/prob_snapshot/cluster_8:0.018161830339374078 - cluster/prob_snapshot/cluster_9:0.01782289846195933 - cluster/prob_snapshot/cluster_10:0.02182760932680178 - cluster/prob_snapshot/cluster_11:0.015520481419340643 - cluster/prob_snapshot/cluster_12:0.012366917465610075 - cluster/prob_snapshot/cluster_13:0.010511879845768563 - cluster/prob_snapshot/cluster_14:0.02010817495697068 - cluster/prob_snapshot/cluster_15:0.015520481419340643 - cluster/prob_snapshot/cluster_16:0.01284026741506503 - cluster/prob_snapshot/cluster_17:0.01151577055449819 - cluster/prob_snapshot/cluster_18:0.021131634457552017 - cluster/prob_snapshot/cluster_19:0.015520481419340643 - cluster/prob_snapshot/cluster_20:0.01782289846195933 - cluster/prob_snapshot/cluster_21:0.013832397185284869 - cluster/prob_snapshot/cluster_22:0.012014460317840186 - cluster/prob_snapshot/cluster_23:0.016921356300340498 - cluster/prob_snapshot/cluster_24:0.021957461960190685 - cluster/prob_snapshot/cluster_25:0.012014460317840186 - cluster/prob_snapshot/cluster_26:0.011810406179657621 - cluster/prob_snapshot/cluster_27:0.0150036122873242 - cluster/prob_snapshot/cluster_28:0.014112823222276305 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017199346116425804 - cluster/prob_snapshot/cluster_31:0.013801047049509543 - cluster/prob_snapshot/cluster_32:0.015693185421747885 - cluster/prob_snapshot/cluster_33:0.01706572775666862 - cluster/prob_snapshot/cluster_34:0.014221955085451585 - cluster/prob_snapshot/cluster_35:0.021019961192837562 - cluster/prob_snapshot/cluster_36:0.017511122289192566 - cluster/prob_snapshot/cluster_37:0.0146115129856183 - cluster/prob_snapshot/cluster_38:0.01631428355251036 - cluster/prob_snapshot/cluster_39:0.015520481419340643 - 
cluster/prob_snapshot/cluster_40:0.02567951517778502 - cluster/prob_snapshot/cluster_41:0.022423762766690246 - cluster/prob_snapshot/cluster_42:0.010122321945601847 - cluster/prob_snapshot/cluster_43:0.018499452756863204 - cluster/prob_snapshot/cluster_44:0.019859078650340742 - cluster/prob_snapshot/cluster_45:0.012366917465610075 - cluster/prob_snapshot/cluster_46:0.01706572775666862 - cluster/prob_snapshot/cluster_47:0.01642944985306298 - cluster/prob_snapshot/cluster_48:0.014221955085451585 - cluster/prob_snapshot/cluster_49:0.02138073076418196 - cluster/prob_snapshot/cluster_50:0.012598481811694885 - cluster/prob_snapshot/cluster_51:0.013801047049509543 - cluster/prob_snapshot/cluster_52:0.014468675088890501 - cluster/prob_snapshot/cluster_53:0.013801047049509543 - cluster/prob_snapshot/cluster_54:0.020775802996351643 - cluster/prob_snapshot/cluster_55:0.0243482696903572 - cluster/prob_snapshot/cluster_56:0.01642944985306298 - cluster/prob_snapshot/cluster_57:0.01706572775666862 - cluster/prob_snapshot/cluster_58:0.020775802996351643 - cluster/prob_snapshot/cluster_59:0.017511122289192566 - cluster/prob_snapshot/cluster_60:0.010511879845768563 - cluster/prob_snapshot/cluster_61:0.012083096709774322 - cluster/prob_snapshot/cluster_62:0.01931369759010734 - cluster/prob_snapshot/cluster_63:0.009386057514286748
(RewardLoopWorker pid=2826756) WARNING:2026-04-12 22:04:23,281:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  39%|███▉      | 314/800 [10:34:10<17:07:38, 126.87s/it]
[36m(TaskRunner pid=2823680)[0m step:314 - global_seqlen/min:429041 - global_seqlen/max:532938 - global_seqlen/minmax_diff:103897 - global_seqlen/balanced_min:464045 - global_seqlen/balanced_max:464189 - global_seqlen/mean:464117.5 - frontier/skipped_zero_acc_count:29.0 - actor/entropy:np.float64(0.1515887315943837) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0024339035153388977 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.0420623136596987) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.001912109985278221) - actor/ppo_kl:np.float64(-0.005616879339036132) - actor/pg_clipfrac_lower:np.float64(0.0004250785965996329) - actor/grad_norm:np.float64(0.42386984022764057) - perf/mfu/actor:np.float64(0.23600392852172236) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(104.2530632019043) - actor/lr:np.float64(1e-06) - training/global_step:314 - training/epoch:0 - critic/score/mean:0.6073232293128967 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6377291679382324 - critic/rewards/max:1.2007043361663818 - critic/rewards/min:-0.08381834626197815 - critic/advantages/mean:0.02741400897502899 - critic/advantages/max:2.474067449569702 - critic/advantages/min:-2.4746265411376953 - critic/returns/mean:0.02741400897502899 - critic/returns/max:2.474067449569702 - critic/returns/min:-2.4746265411376953 - response_length/mean:1356.0770263671875 - response_length/max:8192.0 - response_length/min:178.0 - response_length/clip_ratio:0.04419191926717758 - response_length_non_aborted/mean:1356.0770263671875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:178.0 - response_length_non_aborted/clip_ratio:0.04419191926717758 - response/aborted_ratio:0.0 - prompt_length/mean:237.48484802246094 - prompt_length/max:477.0 - prompt_length/min:177.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.497387170791626e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4538085795938969) - timing_s/agent_loop/generate_sequences/max:np.float64(35.3889230620116) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.240249094366845) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(35.3889230620116) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:193 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:37.59344364050776 - timing_s/reward:0.0001453068107366562 - timing_s/old_log_prob:10.841265304014087 - timing_s/ref:24.42408750858158 - timing_s/adv:0.07508408557623625 - timing_s/update_actor:23.700432298704982 - timing_s/update_weights:30.264497510157526 - timing_s/step:127.29579831846058 - timing_s/stop_profile:5.086418241262436e-05 - timing_per_token_ms/adv:5.949134465168497e-05 - timing_per_token_ms/update_actor:0.018778554409437105 - timing_per_token_ms/gen:0.03500278268559855 - timing_per_token_ms/ref:0.01935192786360329 - perf/total_num_tokens:1856470 - perf/time_per_step:127.29579831846058 - perf/throughput:3645.9765847015638 - frontier/active_count:61.0 - frontier/completed_count:3.0 - frontier/blacklisted_count:586.0 - frontier/mean_score:2.644720206706557 - frontier/mean_frontier_pct:0.27283481969672135 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:2.1598999999999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.4565476999999993 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:2.2319299999999993 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.7535856999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:3.1889929999999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:80.0 - frontier/cluster_8/score:2.9371636690999994 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:48.0 - frontier/cluster_9/score:2.9176456999999996 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:32.0 - frontier/cluster_10/score:3.53 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.51 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.09 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:3.2519299999999993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.2569999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.0765509999999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.2922119899999993 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:2.8823509999999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.237 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.9429999999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.736551989999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:48.0 - frontier/cluster_24/score:3.9856999999999996 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.9429999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.4264109999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.2823509999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.7815089999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.2319299999999993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.5379299999999994 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 
- frontier/cluster_33/score:2.7598999999999996 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:3.3993856999999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.8823509999999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.9540999999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.6383750999999993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.51 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:96.0 - frontier/cluster_40/score:4.152937099999999 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.6264109999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.637 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9917645699999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:3.211645699999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:32.0 - frontier/cluster_46/score:2.8319299999999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.3 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:3.9204109999999996 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.037448999999999 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.2319299999999993 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:32.0 - frontier/cluster_52/score:2.339899999999999 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:2.2319299999999993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:3.3598999999999997 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.9376456999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.1598999999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:32.0 - frontier/cluster_58/score:3.2519299999999993 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:48.0 - frontier/cluster_59/score:2.8823509999999994 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:3.0864119899999993 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:64.0 - frontier/cluster_63/score:1.5179299999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:314.0 - cluster/prob_snapshot/cluster_0:0.0190729525274189 - cluster/prob_snapshot/cluster_1:0.013388258096838506 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02142559967203737 - cluster/prob_snapshot/cluster_4:0.0 - 
cluster/prob_snapshot/cluster_5:0.01383473998522004 - cluster/prob_snapshot/cluster_6:0.010869696720461695 - cluster/prob_snapshot/cluster_7:0.019767147253581797 - cluster/prob_snapshot/cluster_8:0.018206169394216383 - cluster/prob_snapshot/cluster_9:0.018085186196921642 - cluster/prob_snapshot/cluster_10:0.02188089776463722 - cluster/prob_snapshot/cluster_11:0.015558372064940346 - cluster/prob_snapshot/cluster_12:0.012397109215091911 - cluster/prob_snapshot/cluster_13:0.012954979129771045 - cluster/prob_snapshot/cluster_14:0.020157265684916915 - cluster/prob_snapshot/cluster_15:0.020188692356777175 - cluster/prob_snapshot/cluster_16:0.012871614768854157 - cluster/prob_snapshot/cluster_17:0.011543884371917813 - cluster/prob_snapshot/cluster_18:0.020406955799632535 - cluster/prob_snapshot/cluster_19:0.015558372064940346 - cluster/prob_snapshot/cluster_20:0.01786641007161469 - cluster/prob_snapshot/cluster_21:0.013866166657080303 - cluster/prob_snapshot/cluster_22:0.01204379160246179 - cluster/prob_snapshot/cluster_23:0.016962666946403547 - cluster/prob_snapshot/cluster_24:0.02470557909929591 - cluster/prob_snapshot/cluster_25:0.01204379160246179 - cluster/prob_snapshot/cluster_26:0.011839239300412774 - cluster/prob_snapshot/cluster_27:0.015040241083850185 - cluster/prob_snapshot/cluster_28:0.014147277307087114 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017241335427880537 - cluster/prob_snapshot/cluster_31:0.01383473998522004 - cluster/prob_snapshot/cluster_32:0.0157314976951291 - cluster/prob_snapshot/cluster_33:0.01710739086136608 - cluster/prob_snapshot/cluster_34:0.014256675597355696 - cluster/prob_snapshot/cluster_35:0.02107127789356083 - cluster/prob_snapshot/cluster_36:0.01786641007161469 - cluster/prob_snapshot/cluster_37:0.01211259555860555 - cluster/prob_snapshot/cluster_38:0.016354112132539515 - cluster/prob_snapshot/cluster_39:0.015558372064940346 - cluster/prob_snapshot/cluster_40:0.02574220739605353 - 
cluster/prob_snapshot/cluster_41:0.022478506612905333 - cluster/prob_snapshot/cluster_42:0.01014703389255273 - cluster/prob_snapshot/cluster_43:0.01854461606006624 - cluster/prob_snapshot/cluster_44:0.01990756125154015 - cluster/prob_snapshot/cluster_45:0.012397109215091911 - cluster/prob_snapshot/cluster_46:0.017553872749747613 - cluster/prob_snapshot/cluster_47:0.01710739086136608 - cluster/prob_snapshot/cluster_48:0.014256675597355696 - cluster/prob_snapshot/cluster_49:0.024300881667523845 - cluster/prob_snapshot/cluster_50:0.012629238886589894 - cluster/prob_snapshot/cluster_51:0.01383473998522004 - cluster/prob_snapshot/cluster_52:0.014503997926196775 - cluster/prob_snapshot/cluster_53:0.01383473998522004 - cluster/prob_snapshot/cluster_54:0.020826523625893652 - cluster/prob_snapshot/cluster_55:0.024407711896618514 - cluster/prob_snapshot/cluster_56:0.013388258096838506 - cluster/prob_snapshot/cluster_57:0.01710739086136608 - cluster/prob_snapshot/cluster_58:0.020157265684916915 - cluster/prob_snapshot/cluster_59:0.01786641007161469 - cluster/prob_snapshot/cluster_60:0.010537542832828123 - cluster/prob_snapshot/cluster_61:0.01211259555860555 - cluster/prob_snapshot/cluster_62:0.019131293261399576 - cluster/prob_snapshot/cluster_63:0.009408971995432228
[36m(TaskRunner pid=2823680)[0m Training Progress:  39%|███▉      | 315/800 [10:36:10<16:47:03, 124.58s/it]
[36m(TaskRunner pid=2823680)[0m step:315 - global_seqlen/min:295930 - global_seqlen/max:446446 - global_seqlen/minmax_diff:150516 - global_seqlen/balanced_min:395624 - global_seqlen/balanced_max:395686 - global_seqlen/mean:395661.5 - frontier/skipped_zero_acc_count:21.0 - actor/entropy:np.float64(0.18613214970186906) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011341146193444729 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.003727561794221401) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0013230225891607847) - actor/ppo_kl:np.float64(0.005313104319123307) - actor/pg_clipfrac_lower:np.float64(4.8265490477206185e-05) - actor/grad_norm:np.float64(0.3682651902948107) - perf/mfu/actor:np.float64(0.1565512587129445) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(104.9092788696289) - actor/lr:np.float64(1e-06) - training/global_step:315 - training/epoch:0 - critic/score/mean:0.6378504633903503 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6559203863143921 - critic/rewards/max:1.222760796546936 - critic/rewards/min:-0.06801710277795792 - critic/advantages/mean:-0.12827157974243164 - critic/advantages/max:2.4747886657714844 - critic/advantages/min:-2.4748568534851074 - critic/returns/mean:-0.12827157974243164 - critic/returns/max:2.4747886657714844 - critic/returns/min:-2.4748568534851074 - response_length/mean:1286.314208984375 - response_length/max:8192.0 - response_length/min:136.0 - response_length/clip_ratio:0.05257009342312813 - response_length_non_aborted/mean:1286.314208984375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:136.0 - response_length_non_aborted/clip_ratio:0.05257009342312813 - response/aborted_ratio:0.0 - prompt_length/mean:237.8224334716797 - prompt_length/max:359.0 - prompt_length/min:178.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.669309318065643e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2250514421612024) - timing_s/agent_loop/generate_sequences/max:np.float64(33.000771316699684) - timing_s/agent_loop/generate_sequences/mean:np.float64(6.702612696895812) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(33.000771316699684) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:203 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:34.752075606025755 - timing_s/reward:0.00012824777513742447 - timing_s/old_log_prob:12.843248320743442 - timing_s/ref:14.817297032102942 - timing_s/adv:0.08820447791367769 - timing_s/update_actor:30.17748643271625 - timing_s/update_weights:25.960592890158296 - timing_s/step:119.03949817363173 - timing_s/stop_profile:5.065184086561203e-05 - timing_per_token_ms/adv:6.760720057829405e-05 - timing_per_token_ms/update_actor:0.02313051929406662 - timing_per_token_ms/gen:0.03156166472708806 - timing_per_token_ms/ref:0.011357200860685606 - perf/total_num_tokens:1582646 - perf/time_per_step:119.03949817363173 - perf/throughput:3323.783333015112 - frontier/active_count:60.0 - frontier/completed_count:4.0 - frontier/blacklisted_count:607.0 - frontier/mean_score:2.650106546639499 - frontier/mean_frontier_pct:0.2785908911552298 - frontier/batch_easy_count:4.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:2.4119299999999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.4565476999999993 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:2.2319299999999993 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.7535856999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:48.0 - frontier/cluster_7/score:3.1889929999999995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:96.0 - frontier/cluster_8/score:2.356014568369999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.9423519899999997 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:48.0 - frontier/cluster_10/score:3.9709999999999996 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.51 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.3 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.09 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:3.2519299999999993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.2569999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:64.0 - frontier/cluster_16/score:2.353585699999999 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:64.0 - frontier/cluster_18/score:3.2922119899999993 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.51 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:2.8823509999999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.237 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.9429999999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.736551989999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:4.2899899999999995 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.9429999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.4264109999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.2823509999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.7815089999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.2319299999999993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.676550999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:32.0 
- frontier/cluster_33/score:2.7598999999999996 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:64.0 - frontier/cluster_35/score:3.3993856999999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.8823509999999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.9540999999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.6383750999999993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.51 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.4384876999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:1.637 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9917645699999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:3.211645699999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:2.8823509999999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.7598999999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:2.51 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:48.0 - frontier/cluster_49/score:3.9204109999999996 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.037448999999999 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.2319299999999993 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:2.5379299999999994 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:2.2319299999999993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:3.3598999999999997 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.9376456999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.1598999999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:3.1763509999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:3.5176456999999997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:3.0864119899999993 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:80.0 - frontier/cluster_63/score:1.3625509999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:315.0 - cluster/prob_snapshot/cluster_0:0.01935142320913995 - cluster/prob_snapshot/cluster_1:0.015168761189737054 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.021738419689723534 - cluster/prob_snapshot/cluster_4:0.0 - 
cluster/prob_snapshot/cluster_5:0.01403673123275129 - cluster/prob_snapshot/cluster_6:0.011028397469676932 - cluster/prob_snapshot/cluster_7:0.02005575338121054 - cluster/prob_snapshot/cluster_8:0.014817105947165113 - cluster/prob_snapshot/cluster_9:0.018504614425981513 - cluster/prob_snapshot/cluster_10:0.024973838662169238 - cluster/prob_snapshot/cluster_11:0.0157855288446348 - cluster/prob_snapshot/cluster_12:0.01446482722815141 - cluster/prob_snapshot/cluster_13:0.01314412561166802 - cluster/prob_snapshot/cluster_14:0.020451567655670612 - cluster/prob_snapshot/cluster_15:0.020483453166125713 - cluster/prob_snapshot/cluster_16:0.014801830659629473 - cluster/prob_snapshot/cluster_17:0.011712428457902173 - cluster/prob_snapshot/cluster_18:0.02070490331904284 - cluster/prob_snapshot/cluster_19:0.0157855288446348 - cluster/prob_snapshot/cluster_20:0.018127264880821496 - cluster/prob_snapshot/cluster_21:0.014068616743206395 - cluster/prob_snapshot/cluster_22:0.012219634480129648 - cluster/prob_snapshot/cluster_23:0.017210326841827786 - cluster/prob_snapshot/cluster_24:0.026979984417607506 - cluster/prob_snapshot/cluster_25:0.012219634480129648 - cluster/prob_snapshot/cluster_26:0.012012095654682258 - cluster/prob_snapshot/cluster_27:0.01525983299977656 - cluster/prob_snapshot/cluster_28:0.014353831690868953 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01749306396458617 - cluster/prob_snapshot/cluster_31:0.01403673123275129 - cluster/prob_snapshot/cluster_32:0.01683297729666777 - cluster/prob_snapshot/cluster_33:0.01735716376825003 - cluster/prob_snapshot/cluster_34:0.01446482722815141 - cluster/prob_snapshot/cluster_35:0.021378924709716755 - cluster/prob_snapshot/cluster_36:0.018127264880821496 - cluster/prob_snapshot/cluster_37:0.012289442994143768 - cluster/prob_snapshot/cluster_38:0.01659288694980726 - cluster/prob_snapshot/cluster_39:0.0157855288446348 - cluster/prob_snapshot/cluster_40:0.0 - 
cluster/prob_snapshot/cluster_41:0.021624839350705964 - cluster/prob_snapshot/cluster_42:0.010295183553253852 - cluster/prob_snapshot/cluster_43:0.01881537287493682 - cluster/prob_snapshot/cluster_44:0.020198217464580604 - cluster/prob_snapshot/cluster_45:0.01257811063317514 - cluster/prob_snapshot/cluster_46:0.018127264880821496 - cluster/prob_snapshot/cluster_47:0.01735716376825003 - cluster/prob_snapshot/cluster_48:0.0157855288446348 - cluster/prob_snapshot/cluster_49:0.02465568164275839 - cluster/prob_snapshot/cluster_50:0.012813629465726022 - cluster/prob_snapshot/cluster_51:0.01403673123275129 - cluster/prob_snapshot/cluster_52:0.015961182159627087 - cluster/prob_snapshot/cluster_53:0.01403673123275129 - cluster/prob_snapshot/cluster_54:0.021130596958202574 - cluster/prob_snapshot/cluster_55:0.024764071624423182 - cluster/prob_snapshot/cluster_56:0.013583730578297488 - cluster/prob_snapshot/cluster_57:0.01735716376825003 - cluster/prob_snapshot/cluster_58:0.01997624714389824 - cluster/prob_snapshot/cluster_59:0.022122668391456404 - cluster/prob_snapshot/cluster_60:0.01069139403819887 - cluster/prob_snapshot/cluster_61:0.012289442994143768 - cluster/prob_snapshot/cluster_62:0.019410615734889117 - cluster/prob_snapshot/cluster_63:0.008569158610671708
[36m(TaskRunner pid=2823680)[0m Training Progress:  40%|███▉      | 316/800 [10:38:21<17:01:55, 126.69s/it]
[36m(TaskRunner pid=2823680)[0m step:316 - global_seqlen/min:397789 - global_seqlen/max:553921 - global_seqlen/minmax_diff:156132 - global_seqlen/balanced_min:485768 - global_seqlen/balanced_max:486306 - global_seqlen/mean:486037.5 - frontier/skipped_zero_acc_count:23.0 - actor/entropy:np.float64(0.15073261494344137) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:-0.003757308702915907 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.05431059907641611) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0011181578787350104) - actor/ppo_kl:np.float64(0.0003486933452214149) - actor/pg_clipfrac_lower:np.float64(4.147649214812774e-05) - actor/grad_norm:np.float64(0.5000175833702087) - perf/mfu/actor:np.float64(0.2105527685867779) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(104.34487819671631) - actor/lr:np.float64(1e-06) - training/global_step:316 - training/epoch:0 - critic/score/mean:0.5892857313156128 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6345540881156921 - critic/rewards/max:1.3050365447998047 - critic/rewards/min:-0.17145578563213348 - critic/advantages/mean:0.04085973650217056 - critic/advantages/max:2.47440505027771 - critic/advantages/min:-2.474696159362793 - critic/returns/mean:0.04085973650217056 - critic/returns/max:2.47440505027771 - critic/returns/min:-2.474696159362793 - response_length/mean:1548.199951171875 - response_length/max:8192.0 - response_length/min:231.0 - response_length/clip_ratio:0.05476190522313118 - response_length_non_aborted/mean:1548.199951171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:231.0 - response_length_non_aborted/clip_ratio:0.05476190522313118 - response/aborted_ratio:0.0 - prompt_length/mean:237.91429138183594 - prompt_length/max:363.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.992478251457214e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.9437340619042516) - timing_s/agent_loop/generate_sequences/max:np.float64(35.82723133917898) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.993704352069471) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(35.82723133917898) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:198 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:38.03775081504136 - timing_s/reward:0.00012087356299161911 - timing_s/old_log_prob:13.915591158904135 - timing_s/ref:19.527236541733146 - timing_s/adv:0.07189483288675547 - timing_s/update_actor:27.78322947025299 - timing_s/update_weights:31.644693210721016 - timing_s/step:131.36736608669162 - timing_s/stop_profile:5.893409252166748e-05 - timing_per_token_ms/adv:4.79191547005174e-05 - timing_per_token_ms/update_actor:0.018518004947060518 - timing_per_token_ms/gen:0.029248828758928466 - timing_per_token_ms/ref:0.013015242280217994 - perf/total_num_tokens:1944150 - perf/time_per_step:131.36736608669162 - perf/throughput:3699.834399353454 - frontier/active_count:60.0 - frontier/completed_count:4.0 - frontier/blacklisted_count:630.0 - frontier/mean_score:2.6686688662394995 - frontier/mean_frontier_pct:0.2872997445476664 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:14.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:2.4119299999999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.4565476999999993 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:2.2319299999999993 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.7535856999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:3.1322950999999994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:96.0 - frontier/cluster_8/score:2.356014568369999 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.9596463929999994 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:48.0 - frontier/cluster_10/score:3.9709999999999996 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.51 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.3 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.09 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:3.2519299999999993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.2569999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:80.0 - frontier/cluster_16/score:2.547509989999999 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:3.204548392999999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:2.9176456999999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:16.0 - frontier/cluster_21/score:2.237 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.9429999999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.736551989999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:4.2899899999999995 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:32.0 - frontier/cluster_25/score:1.9429999999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:2.237 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.4264109999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.2823509999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.7815089999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.2319299999999993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.676550999999999 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:32.0 - frontier/cluster_33/score:2.8319299999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.2795699899999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.8823509999999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.9540999999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:64.0 - frontier/cluster_38/score:2.6383750999999993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.51 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.4384876999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:32.0 - frontier/cluster_42/score:2.0458999999999996 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9917645699999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:3.1481519899999992 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:2.8823509999999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:2.8319299999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:16.0 - frontier/cluster_48/score:2.51 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 - frontier/cluster_49/score:4.244287699999999 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.037448999999999 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.2319299999999993 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:48.0 - frontier/cluster_52/score:2.5379299999999994 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:2.2319299999999993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:3.2519299999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.9376456999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4119299999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:48.0 - frontier/cluster_58/score:3.1763509999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:3.5176456999999997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:3.0864119899999993 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:80.0 - frontier/cluster_63/score:1.3625509999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:316.0 - cluster/prob_snapshot/cluster_0:0.01921682153305074 - cluster/prob_snapshot/cluster_1:0.015063252635752705 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.021587214907824833 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01393909667996813 - cluster/prob_snapshot/cluster_6:0.01095168782574256 - cluster/prob_snapshot/cluster_7:0.019562156621887983 - cluster/prob_snapshot/cluster_8:0.014714043383046427 - cluster/prob_snapshot/cluster_9:0.018483911776151562 - cluster/prob_snapshot/cluster_10:0.024800129446780794 - cluster/prob_snapshot/cluster_11:0.015675730272329335 - cluster/prob_snapshot/cluster_12:0.014364214990580666 - cluster/prob_snapshot/cluster_13:0.013052699708831998 - cluster/prob_snapshot/cluster_14:0.02030931376274738 - cluster/prob_snapshot/cluster_15:0.02034097748883532 - cluster/prob_snapshot/cluster_16:0.01590999182044 - cluster/prob_snapshot/cluster_17:0.011630960935618648 - cluster/prob_snapshot/cluster_18:0.02001340089772686 - cluster/prob_snapshot/cluster_19:0.016593790969553403 - cluster/prob_snapshot/cluster_20:0.0182216043918014 - cluster/prob_snapshot/cluster_21:0.013970760406056068 - cluster/prob_snapshot/cluster_22:0.012134639011607929 - cluster/prob_snapshot/cluster_23:0.017090617877070148 - cluster/prob_snapshot/cluster_24:0.02679232115975702 - cluster/prob_snapshot/cluster_25:0.012134639011607929 - cluster/prob_snapshot/cluster_26:0.013970760406056068 - cluster/prob_snapshot/cluster_27:0.015153690982395574 - cluster/prob_snapshot/cluster_28:0.014253991499115987 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017371388380102187 - cluster/prob_snapshot/cluster_31:0.01393909667996813 - cluster/prob_snapshot/cluster_32:0.016715893042284203 - cluster/prob_snapshot/cluster_33:0.017686283199250044 - cluster/prob_snapshot/cluster_34:0.014364214990580666 - cluster/prob_snapshot/cluster_35:0.02048193409261586 - cluster/prob_snapshot/cluster_36:0.0180011780183979 - cluster/prob_snapshot/cluster_37:0.012203961962214642 - cluster/prob_snapshot/cluster_38:0.01647747267921511 - cluster/prob_snapshot/cluster_39:0.015675730272329335 - cluster/prob_snapshot/cluster_40:0.0 - 
cluster/prob_snapshot/cluster_41:0.02147442459359445 - cluster/prob_snapshot/cluster_42:0.012777281499664775 - cluster/prob_snapshot/cluster_43:0.018684499775948746 - cluster/prob_snapshot/cluster_44:0.019661187829297544 - cluster/prob_snapshot/cluster_45:0.010617028471298754 - cluster/prob_snapshot/cluster_46:0.0180011780183979 - cluster/prob_snapshot/cluster_47:0.017686283199250044 - cluster/prob_snapshot/cluster_48:0.015675730272329335 - cluster/prob_snapshot/cluster_49:0.026506896088990057 - cluster/prob_snapshot/cluster_50:0.012724502377540687 - cluster/prob_snapshot/cluster_51:0.01393909667996813 - cluster/prob_snapshot/cluster_52:0.015850161804801907 - cluster/prob_snapshot/cluster_53:0.01393909667996813 - cluster/prob_snapshot/cluster_54:0.02030931376274738 - cluster/prob_snapshot/cluster_55:0.024591821474580652 - cluster/prob_snapshot/cluster_56:0.015063252635752705 - cluster/prob_snapshot/cluster_57:0.01723643345761025 - cluster/prob_snapshot/cluster_58:0.019837299412846038 - cluster/prob_snapshot/cluster_59:0.021968790911083315 - cluster/prob_snapshot/cluster_60:0.010617028471298754 - cluster/prob_snapshot/cluster_61:0.012203961962214642 - cluster/prob_snapshot/cluster_62:0.019275602336463436 - cluster/prob_snapshot/cluster_63:0.008509554565056815
[36m(RewardLoopWorker pid=2826759)[0m WARNING:2026-04-12 22:10:50,377:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  40%|███▉      | 317/800 [10:41:01<18:19:48, 136.62s/it]
[36m(TaskRunner pid=2823680)[0m step:317 - global_seqlen/min:447757 - global_seqlen/max:630110 - global_seqlen/minmax_diff:182353 - global_seqlen/balanced_min:513771 - global_seqlen/balanced_max:513890 - global_seqlen/mean:513838.0 - frontier/skipped_zero_acc_count:36.0 - actor/entropy:np.float64(0.17254468007013202) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:-0.002584237139672041 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.04048438079189509) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0011643197174872394) - actor/ppo_kl:np.float64(0.000378998405425512) - actor/pg_clipfrac_lower:np.float64(1.7235345121266818e-05) - actor/grad_norm:np.float64(0.7235311816136042) - perf/mfu/actor:np.float64(0.2630769035045905) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(104.3648796081543) - actor/lr:np.float64(1e-06) - training/global_step:317 - training/epoch:0 - critic/score/mean:0.4850543439388275 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.537207305431366 - critic/rewards/max:1.6049216985702515 - critic/rewards/min:-0.06265755742788315 - critic/advantages/mean:0.050748493522405624 - critic/advantages/max:2.4747533798217773 - critic/advantages/min:-2.4747846126556396 - critic/returns/mean:0.050748493522405624 - critic/returns/max:2.4747533798217773 - critic/returns/min:-2.4747846126556396 - response_length/mean:1550.1385498046875 - response_length/max:8192.0 - response_length/min:186.0 - response_length/clip_ratio:0.05163043364882469 - response_length_non_aborted/mean:1550.1385498046875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:186.0 - response_length_non_aborted/clip_ratio:0.05163043364882469 - response/aborted_ratio:0.0 - prompt_length/mean:249.97825622558594 - prompt_length/max:544.0 - prompt_length/min:185.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.194668382406235e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.895139668136835) - timing_s/agent_loop/generate_sequences/max:np.float64(39.63749131094664) - timing_s/agent_loop/generate_sequences/mean:np.float64(9.711019269371718) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(39.63749131094664) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:190 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:41.713014867156744 - timing_s/reward:0.00013742223381996155 - timing_s/old_log_prob:14.212282771244645 - timing_s/ref:34.519045979715884 - timing_s/adv:0.06868475303053856 - timing_s/update_actor:23.631463509052992 - timing_s/update_weights:45.008861928246915 - timing_s/step:159.56544403731823 - timing_s/stop_profile:5.5058859288692474e-05 - timing_per_token_ms/adv:5.184200982615754e-05 - timing_per_token_ms/update_actor:0.017836601420086702 - timing_per_token_ms/gen:0.03656143548451729 - timing_per_token_ms/ref:0.026054351830811017 - perf/total_num_tokens:2055352 - perf/time_per_step:159.56544403731823 - perf/throughput:3220.2335731276917 - frontier/active_count:60.0 - frontier/completed_count:4.0 - frontier/blacklisted_count:666.0 - frontier/mean_score:2.6245899489476496 - frontier/mean_frontier_pct:0.3077273653684371 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:7.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:32.0 - frontier/cluster_1/score:2.4119299999999995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.319583389999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:2.2319299999999993 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.7535856999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:64.0 - frontier/cluster_7/score:3.092606569999999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:1.9492101978589993 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.9596463929999994 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:48.0 - frontier/cluster_10/score:3.6796999999999995 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.6569999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.3 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.3629999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:3.2519299999999993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.2569999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:96.0 - 
frontier/cluster_16/score:2.083256992999999 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:3.204548392999999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:48.0 - frontier/cluster_20/score:2.9176456999999996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:2.4659 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:32.0 - frontier/cluster_22/score:1.9429999999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.736551989999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:4.2899899999999995 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.6601 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:2.237 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.4264109999999994 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.2823509999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:48.0 - frontier/cluster_30/score:2.7815089999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.2319299999999993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.676550999999999 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.8823509999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.2795699899999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.8823509999999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.9540999999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:80.0 - frontier/cluster_38/score:2.7468625699999993 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.51 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.4384876999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.7321299999999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9917645699999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:3.1481519899999992 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:2.8823509999999994 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.0569999999999995 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:64.0 
- frontier/cluster_49/score:4.244287699999999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.037448999999999 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.2319299999999993 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:2.0765509999999994 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:2.2319299999999993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:32.0 - frontier/cluster_54/score:3.2519299999999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.9376456999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4119299999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:2.5234456999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:3.5176456999999997 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:3.0864119899999993 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:80.0 - frontier/cluster_63/score:1.3625509999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:317.0 - cluster/prob_snapshot/cluster_0:0.019539560209737065 - cluster/prob_snapshot/cluster_1:0.015316233817572673 - cluster/prob_snapshot/cluster_2:0.0 - 
cluster/prob_snapshot/cluster_3:0.021080012843727028 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.014173198121195463 - cluster/prob_snapshot/cluster_6:0.011135616954203417 - cluster/prob_snapshot/cluster_7:0.019638665024226006 - cluster/prob_snapshot/cluster_8:0.012377871310529549 - cluster/prob_snapshot/cluster_9:0.01879434153251694 - cluster/prob_snapshot/cluster_10:0.023366824733106754 - cluster/prob_snapshot/cluster_11:0.016872476918190243 - cluster/prob_snapshot/cluster_12:0.014605456120375447 - cluster/prob_snapshot/cluster_13:0.015005518614107469 - cluster/prob_snapshot/cluster_14:0.020650400400666315 - cluster/prob_snapshot/cluster_15:0.020682595906114275 - cluster/prob_snapshot/cluster_16:0.013229095042924691 - cluster/prob_snapshot/cluster_17:0.011826298178798838 - cluster/prob_snapshot/cluster_18:0.02034951779981789 - cluster/prob_snapshot/cluster_19:0.016872476918190243 - cluster/prob_snapshot/cluster_20:0.018527628802674827 - cluster/prob_snapshot/cluster_21:0.015658954020536443 - cluster/prob_snapshot/cluster_22:0.01233843532256065 - cluster/prob_snapshot/cluster_23:0.017377647830900476 - cluster/prob_snapshot/cluster_24:0.02724228726167368 - cluster/prob_snapshot/cluster_25:0.010541964219754469 - cluster/prob_snapshot/cluster_26:0.014205393626643427 - cluster/prob_snapshot/cluster_27:0.015408191039346219 - cluster/prob_snapshot/cluster_28:0.01449338147034566 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017663133759969297 - cluster/prob_snapshot/cluster_31:0.014173198121195463 - cluster/prob_snapshot/cluster_32:0.016996629645411744 - cluster/prob_snapshot/cluster_33:0.01830350045826969 - cluster/prob_snapshot/cluster_34:0.014605456120375447 - cluster/prob_snapshot/cluster_35:0.020825919818541365 - cluster/prob_snapshot/cluster_36:0.01830350045826969 - cluster/prob_snapshot/cluster_37:0.012408922523837244 - cluster/prob_snapshot/cluster_38:0.017443122058624663 - 
cluster/prob_snapshot/cluster_39:0.015938997766148856 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.021835078792522042 - cluster/prob_snapshot/cluster_42:0.010999369004254748 - cluster/prob_snapshot/cluster_43:0.018998298325925617 - cluster/prob_snapshot/cluster_44:0.019991389456616365 - cluster/prob_snapshot/cluster_45:0.010795337132451418 - cluster/prob_snapshot/cluster_46:0.01830350045826969 - cluster/prob_snapshot/cluster_47:0.01830350045826969 - cluster/prob_snapshot/cluster_48:0.013062357930266213 - cluster/prob_snapshot/cluster_49:0.02695206859330401 - cluster/prob_snapshot/cluster_50:0.012938205203044707 - cluster/prob_snapshot/cluster_51:0.014173198121195463 - cluster/prob_snapshot/cluster_52:0.013186510657487717 - cluster/prob_snapshot/cluster_53:0.014173198121195463 - cluster/prob_snapshot/cluster_54:0.020650400400666315 - cluster/prob_snapshot/cluster_55:0.025004831082145677 - cluster/prob_snapshot/cluster_56:0.015316233817572673 - cluster/prob_snapshot/cluster_57:0.017525912324619215 - cluster/prob_snapshot/cluster_58:0.01602438062760874 - cluster/prob_snapshot/cluster_59:0.02233774779059886 - cluster/prob_snapshot/cluster_60:0.010795337132451418 - cluster/prob_snapshot/cluster_61:0.012408922523837244 - cluster/prob_snapshot/cluster_62:0.019599328212758983 - cluster/prob_snapshot/cluster_63:0.008652469061858122
[36m(TaskRunner pid=2823680)[0m Training Progress:  40%|███▉      | 318/800 [10:43:35<18:59:46, 141.88s/it]
[36m(TaskRunner pid=2823680)[0m step:318 - global_seqlen/min:403979 - global_seqlen/max:468363 - global_seqlen/minmax_diff:64384 - global_seqlen/balanced_min:444167 - global_seqlen/balanced_max:444299 - global_seqlen/mean:444231.0 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.16822800340984637) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:-0.002540620742365718 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.014420250066905282) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0012726726554319612) - actor/ppo_kl:np.float64(-0.00023881040313729804) - actor/pg_clipfrac_lower:np.float64(5.9098148994962685e-05) - actor/grad_norm:np.float64(0.5026897092660269) - perf/mfu/actor:np.float64(0.2033944913857644) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(104.43678665161133) - actor/lr:np.float64(1e-06) - training/global_step:318 - training/epoch:0 - critic/score/mean:0.6106770634651184 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.649493396282196 - critic/rewards/max:1.4844274520874023 - critic/rewards/min:-0.06505753099918365 - critic/advantages/mean:-0.0605429969727993 - critic/advantages/max:2.4746298789978027 - critic/advantages/min:-2.4743735790252686 - critic/returns/mean:-0.0605429969727993 - critic/returns/max:2.4746298789978027 - critic/returns/min:-2.4743735790252686 - response_length/mean:1477.39453125 - response_length/max:8192.0 - response_length/min:175.0 - response_length/clip_ratio:0.046875 - response_length_non_aborted/mean:1477.39453125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:175.0 - response_length_non_aborted/clip_ratio:0.046875 - response/aborted_ratio:0.0 - prompt_length/mean:235.1770782470703 - prompt_length/max:401.0 - prompt_length/min:187.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.0001093987375497818 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5165807334706187) - timing_s/agent_loop/generate_sequences/max:np.float64(34.278933313675225) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.8988722155472715) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(34.278933313675225) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:190 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:35.89544011186808 - timing_s/reward:0.00010879896581172943 - timing_s/old_log_prob:14.55953168682754 - timing_s/ref:34.09130073618144 - timing_s/adv:0.08810688275843859 - timing_s/update_actor:26.092864841222763 - timing_s/update_weights:42.79730139952153 - timing_s/step:153.92464217264205 - timing_s/stop_profile:0.00010208785533905029 - timing_per_token_ms/adv:6.698844160139181e-05 - timing_per_token_ms/update_actor:0.01983863573316411 - timing_per_token_ms/gen:0.03163600062387075 - timing_per_token_ms/ref:0.02591991722987667 - perf/total_num_tokens:1776924 - perf/time_per_step:153.92464217264205 - perf/throughput:2886.0291226257978 - frontier/active_count:60.0 - frontier/completed_count:4.0 - frontier/blacklisted_count:698.0 - frontier/mean_score:2.6451589677826495 - frontier/mean_frontier_pct:0.32407416636607883 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:1.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:2.5883509999999994 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.319583389999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:48.0 - frontier/cluster_5/score:2.462350999999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.7535856999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:80.0 - frontier/cluster_7/score:3.064824598999999 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:1.9492101978589993 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.9596463929999994 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:48.0 - frontier/cluster_10/score:3.6796999999999995 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.3 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.3629999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:3.2519299999999993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.2569999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:1.7582798950999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:3.204548392999999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:16.0 - frontier/cluster_19/score:2.6569999999999996 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.9423519899999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:2.62613 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:2.8601 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:64.0 - frontier/cluster_23/score:2.736551989999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:64.0 - frontier/cluster_24/score:4.2899899999999995 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.6601 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:2.237 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.5984876999999997 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.2823509999999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.8470562999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.462350999999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.676550999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - 
frontier/cluster_33/score:2.8823509999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.2795699899999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.8823509999999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.9540999999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.2228037989999994 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.51 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.4384876999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.7321299999999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9917645699999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:64.0 - frontier/cluster_44/score:3.1481519899999992 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:2.9176456999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:32.0 - frontier/cluster_48/score:2.0569999999999995 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:4.47100139 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.037448999999999 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.2319299999999993 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:2.0765509999999994 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:2.2319299999999993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.1763509999999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.9376456999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4119299999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:2.5234456999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:3.3623519899999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:1.9540999999999997 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:3.0864119899999993 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:80.0 - frontier/cluster_63/score:1.3625509999999996 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:318.0 - cluster/prob_snapshot/cluster_0:0.019387618648992758 - cluster/prob_snapshot/cluster_1:0.01630872997001594 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.020916092570312184 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.015514826833918086 - cluster/prob_snapshot/cluster_6:0.011049025290844007 - cluster/prob_snapshot/cluster_7:0.019310903656634428 - cluster/prob_snapshot/cluster_8:0.012281619753922002 - cluster/prob_snapshot/cluster_9:0.0186481948662174 - cluster/prob_snapshot/cluster_10:0.023185121983327478 - cluster/prob_snapshot/cluster_11:0.017389629089813164 - cluster/prob_snapshot/cluster_12:0.01449188264305601 - cluster/prob_snapshot/cluster_13:0.014888834211104934 - cluster/prob_snapshot/cluster_14:0.02048982083627527 - cluster/prob_snapshot/cluster_15:0.020521765986275402 - cluster/prob_snapshot/cluster_16:0.01107860256236262 - cluster/prob_snapshot/cluster_17:0.011734335709642607 - cluster/prob_snapshot/cluster_18:0.020191277928412923 - cluster/prob_snapshot/cluster_19:0.01674127486199992 - cluster/prob_snapshot/cluster_20:0.01853922597114883 - cluster/prob_snapshot/cluster_21:0.016546768593655947 - cluster/prob_snapshot/cluster_22:0.018020971107567174 - cluster/prob_snapshot/cluster_23:0.017242517515522335 - cluster/prob_snapshot/cluster_24:0.027030448530384284 - cluster/prob_snapshot/cluster_25:0.01045998885901621 - cluster/prob_snapshot/cluster_26:0.014094931075007087 - cluster/prob_snapshot/cluster_27:0.016372599477315015 - cluster/prob_snapshot/cluster_28:0.014380679496635443 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017938785120770982 - cluster/prob_snapshot/cluster_31:0.015514826833918086 - cluster/prob_snapshot/cluster_32:0.016864462165284432 - cluster/prob_snapshot/cluster_33:0.018161170620910924 - cluster/prob_snapshot/cluster_34:0.01449188264305601 - cluster/prob_snapshot/cluster_35:0.020663975397725378 - cluster/prob_snapshot/cluster_36:0.018161170620910924 - cluster/prob_snapshot/cluster_37:0.012312429509911195 - cluster/prob_snapshot/cluster_38:0.014005483388542198 - cluster/prob_snapshot/cluster_39:0.01581505453655243 - 
cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.021665287051300688 - cluster/prob_snapshot/cluster_42:0.01091383681848548 - cluster/prob_snapshot/cluster_43:0.01885056567134475 - cluster/prob_snapshot/cluster_44:0.019835934426775316 - cluster/prob_snapshot/cluster_45:0.01071139151878053 - cluster/prob_snapshot/cluster_46:0.01838355612105087 - cluster/prob_snapshot/cluster_47:0.018161170620910924 - cluster/prob_snapshot/cluster_48:0.012960783737724439 - cluster/prob_snapshot/cluster_49:0.028170968452530563 - cluster/prob_snapshot/cluster_50:0.01283759643443992 - cluster/prob_snapshot/cluster_51:0.014062985925006953 - cluster/prob_snapshot/cluster_52:0.013083971041008955 - cluster/prob_snapshot/cluster_53:0.014062985925006953 - cluster/prob_snapshot/cluster_54:0.02001361127180591 - cluster/prob_snapshot/cluster_55:0.024810391032319187 - cluster/prob_snapshot/cluster_56:0.0151971332622896 - cluster/prob_snapshot/cluster_57:0.017389629089813164 - cluster/prob_snapshot/cluster_58:0.01589977345240188 - cluster/prob_snapshot/cluster_59:0.021185569758141665 - cluster/prob_snapshot/cluster_60:0.01071139151878053 - cluster/prob_snapshot/cluster_61:0.012312429509911195 - cluster/prob_snapshot/cluster_62:0.01944692189008737 - cluster/prob_snapshot/cluster_63:0.008585186603121133
[36m(TaskRunner pid=2823680)[0m Training Progress:  40%|███▉      | 319/800 [10:46:00<19:03:57, 142.70s/it]
[36m(TaskRunner pid=2823680)[0m step:319 - global_seqlen/min:410781 - global_seqlen/max:479804 - global_seqlen/minmax_diff:69023 - global_seqlen/balanced_min:437326 - global_seqlen/balanced_max:437617 - global_seqlen/mean:437433.0 - frontier/skipped_zero_acc_count:36.0 - actor/entropy:np.float64(0.1666703484547527) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010477867908775806 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.055128930136561394) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.001223447576510649) - actor/ppo_kl:np.float64(0.00025512903574587614) - actor/pg_clipfrac_lower:np.float64(9.418225604547288e-06) - actor/grad_norm:np.float64(0.4156293335060279) - perf/mfu/actor:np.float64(0.2337122442978483) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(104.32754135131836) - actor/lr:np.float64(1e-06) - training/global_step:319 - training/epoch:0 - critic/score/mean:0.6847826242446899 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.7064488530158997 - critic/rewards/max:1.5486812591552734 - critic/rewards/min:-0.11268407106399536 - critic/advantages/mean:-0.08357354998588562 - critic/advantages/max:2.472534418106079 - critic/advantages/min:-2.474533796310425 - critic/returns/mean:-0.08357354998588562 - critic/returns/max:2.472534418106079 - critic/returns/min:-2.474533796310425 - response_length/mean:1413.7296142578125 - response_length/max:8192.0 - response_length/min:157.0 - response_length/clip_ratio:0.05298912897706032 - response_length_non_aborted/mean:1413.7296142578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:157.0 - response_length_non_aborted/clip_ratio:0.05298912897706032 - response/aborted_ratio:0.0 - prompt_length/mean:233.6086883544922 - prompt_length/max:399.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.415671229362488e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.993645953014493) - timing_s/agent_loop/generate_sequences/max:np.float64(32.71591903362423) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.67002045316076) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(32.71591903362423) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:208 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:34.82777324318886 - timing_s/reward:0.0001421971246600151 - timing_s/old_log_prob:13.484889189712703 - timing_s/ref:32.12984893005341 - timing_s/adv:0.06792933866381645 - timing_s/update_actor:22.385456454008818 - timing_s/update_weights:41.04639865551144 - timing_s/step:144.33528778329492 - timing_s/stop_profile:5.21242618560791e-05 - timing_per_token_ms/adv:5.602692309466312e-05 - timing_per_token_ms/update_actor:0.018463130539142784 - timing_per_token_ms/gen:0.033471990277018235 - timing_per_token_ms/ref:0.026500133969449575 - perf/total_num_tokens:1749732 - perf/time_per_step:144.33528778329492 - perf/throughput:3030.6725868504323 - frontier/active_count:59.0 - frontier/completed_count:5.0 - frontier/blacklisted_count:734.0 - frontier/mean_score:2.696878509751846 - frontier/mean_frontier_pct:0.3364173525426304 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:48.0 - frontier/cluster_1/score:2.7118456999999996 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.319583389999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:2.623645699999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.7535856999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:80.0 - frontier/cluster_7/score:3.0453772192999993 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:1.9492101978589993 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.9596463929999994 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.07579 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.3 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.3629999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:3.2519299999999993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.2569999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - 
frontier/cluster_16/score:1.7582798950999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:3.143183875099999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:3.3598999999999997 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.9423519899999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:2.62613 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:48.0 - frontier/cluster_22/score:2.8601 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.815586392999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:4.502993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.6601 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:32.0 - frontier/cluster_26/score:1.8659000000000001 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.5984876999999997 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:48.0 - frontier/cluster_28/score:2.4976456999999996 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.8470562999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.462350999999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.676550999999999 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:48.0 - frontier/cluster_33/score:2.8823509999999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:80.0 - frontier/cluster_35/score:3.2795699899999993 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.8823509999999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:1.9540999999999997 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:96.0 - frontier/cluster_38/score:2.2228037989999994 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.6569999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.4384876999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.7321299999999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9917645699999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:3.103706392999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:48.0 - frontier/cluster_46/score:2.9176456999999996 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:1.7398999999999996 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:4.47100139 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.037448999999999 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.2319299999999993 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:2.0765509999999994 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:2.462350999999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.1763509999999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.9376456999999996 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4119299999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:64.0 - frontier/cluster_58/score:2.5234456999999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:64.0 - frontier/cluster_59/score:3.3623519899999996 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:2.2678699999999994 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:3.0864119899999993 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:319.0 - cluster/prob_snapshot/cluster_0:0.019338113372292836 - cluster/prob_snapshot/cluster_1:0.01704321728786637 - cluster/prob_snapshot/cluster_2:0.0 - 
cluster/prob_snapshot/cluster_3:0.020862684414884685 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.016488904125878643 - cluster/prob_snapshot/cluster_6:0.011020812178950758 - cluster/prob_snapshot/cluster_7:0.01913937274235336 - cluster/prob_snapshot/cluster_8:0.012250259276121769 - cluster/prob_snapshot/cluster_9:0.01860057766972101 - cluster/prob_snapshot/cluster_10:0.025615238577074234 - cluster/prob_snapshot/cluster_11:0.017345225575622684 - cluster/prob_snapshot/cluster_12:0.01445487837382955 - cluster/prob_snapshot/cluster_13:0.014850816346677922 - cluster/prob_snapshot/cluster_14:0.020437501143568487 - cluster/prob_snapshot/cluster_15:0.02046936472328819 - cluster/prob_snapshot/cluster_16:0.011050313926443596 - cluster/prob_snapshot/cluster_17:0.011704372693208622 - cluster/prob_snapshot/cluster_18:0.019754061139631626 - cluster/prob_snapshot/cluster_19:0.02111606341227387 - cluster/prob_snapshot/cluster_20:0.01849188702106319 - cluster/prob_snapshot/cluster_21:0.016504517279941303 - cluster/prob_snapshot/cluster_22:0.017974955494343435 - cluster/prob_snapshot/cluster_23:0.01769519950514106 - cluster/prob_snapshot/cluster_24:0.028300093970959065 - cluster/prob_snapshot/cluster_25:0.010433279821041058 - cluster/prob_snapshot/cluster_26:0.011726677199012417 - cluster/prob_snapshot/cluster_27:0.01633079289538786 - cluster/prob_snapshot/cluster_28:0.0156970281801819 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017892979365193552 - cluster/prob_snapshot/cluster_31:0.015475210529859807 - cluster/prob_snapshot/cluster_32:0.01682139963754428 - cluster/prob_snapshot/cluster_33:0.01811479701551564 - cluster/prob_snapshot/cluster_34:0.01445487837382955 - cluster/prob_snapshot/cluster_35:0.020611211010396254 - cluster/prob_snapshot/cluster_36:0.01811479701551564 - cluster/prob_snapshot/cluster_37:0.01228099036100014 - cluster/prob_snapshot/cluster_38:0.01396972111453533 - 
cluster/prob_snapshot/cluster_39:0.016698526886637005 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.021609965866699523 - cluster/prob_snapshot/cluster_42:0.010885968903331033 - cluster/prob_snapshot/cluster_43:0.01880243173151411 - cluster/prob_snapshot/cluster_44:0.01950595583430096 - cluster/prob_snapshot/cluster_45:0.010684040537178363 - cluster/prob_snapshot/cluster_46:0.018336614665837726 - cluster/prob_snapshot/cluster_47:0.01811479701551564 - cluster/prob_snapshot/cluster_48:0.010934801253315665 - cluster/prob_snapshot/cluster_49:0.028099035348553413 - cluster/prob_snapshot/cluster_50:0.012804816299078535 - cluster/prob_snapshot/cluster_51:0.014027076821261468 - cluster/prob_snapshot/cluster_52:0.013050561800893095 - cluster/prob_snapshot/cluster_53:0.015475210529859807 - cluster/prob_snapshot/cluster_54:0.019962507555474723 - cluster/prob_snapshot/cluster_55:0.024747038988144746 - cluster/prob_snapshot/cluster_56:0.015158328172256826 - cluster/prob_snapshot/cluster_57:0.017345225575622684 - cluster/prob_snapshot/cluster_58:0.0158591742071579 - cluster/prob_snapshot/cluster_59:0.02113147350671902 - cluster/prob_snapshot/cluster_60:0.010684040537178363 - cluster/prob_snapshot/cluster_61:0.014252950007676875 - cluster/prob_snapshot/cluster_62:0.019397265185643137 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  40%|████      | 320/800 [10:48:18<18:50:11, 141.27s/it]
[36m(TaskRunner pid=2823680)[0m step:320 - global_seqlen/min:351178 - global_seqlen/max:553189 - global_seqlen/minmax_diff:202011 - global_seqlen/balanced_min:462909 - global_seqlen/balanced_max:463060 - global_seqlen/mean:463004.25 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.1828113172086887) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0036213535349816084 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.03173732993309386) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0009915707746586122) - actor/ppo_kl:np.float64(-0.0016513901431925622) - actor/pg_clipfrac_lower:np.float64(0.00011925318360302602) - actor/grad_norm:np.float64(0.5155939410130183) - perf/mfu/actor:np.float64(0.2516560020952445) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(104.41838455200195) - actor/lr:np.float64(1e-06) - training/global_step:320 - training/epoch:0 - critic/score/mean:0.6184210777282715 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6401416063308716 - critic/rewards/max:1.5485150814056396 - critic/rewards/min:-0.0641118586063385 - critic/advantages/mean:-0.04525292292237282 - critic/advantages/max:2.474519968032837 - critic/advantages/min:-2.47475266456604 - critic/returns/mean:-0.04525292292237282 - critic/returns/max:2.474519968032837 - critic/returns/min:-2.47475266456604 - response_length/mean:1248.7158203125 - response_length/max:8192.0 - response_length/min:107.0 - response_length/clip_ratio:0.031578946858644485 - response_length_non_aborted/mean:1248.7158203125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:107.0 - response_length_non_aborted/clip_ratio:0.031578946858644485 - response/aborted_ratio:0.0 - prompt_length/mean:241.0736846923828 - prompt_length/max:373.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.950755000114441e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.8791336137801409) - timing_s/agent_loop/generate_sequences/max:np.float64(34.017306834459305) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.989207597598579) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(34.017306834459305) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:194 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:35.582953782752156 - timing_s/reward:0.00011597201228141785 - timing_s/old_log_prob:13.100828820839524 - timing_s/ref:28.41276787687093 - timing_s/adv:0.08849168289452791 - timing_s/update_actor:22.191448252648115 - timing_s/update_weights:37.94417162332684 - timing_s/step:137.71507541276515 - timing_s/stop_profile:5.2782706916332245e-05 - timing_per_token_ms/adv:7.815629450869773e-05 - timing_per_token_ms/update_actor:0.01959959748167183 - timing_per_token_ms/gen:0.03749426124392234 - timing_per_token_ms/ref:0.02509429791993829 - perf/total_num_tokens:1852017 - perf/time_per_step:137.71507541276515 - perf/throughput:3362.0447769589864 - frontier/active_count:59.0 - frontier/completed_count:5.0 - frontier/blacklisted_count:767.0 - frontier/mean_score:2.7140834680655748 - frontier/mean_frontier_pct:0.3551971885784931 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:3.0769999999999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:2.798291989999999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:64.0 - frontier/cluster_3/score:3.319583389999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:64.0 - frontier/cluster_5/score:2.736551989999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.7535856999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:3.0317640535099994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:1.9492101978589993 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:64.0 - frontier/cluster_9/score:2.9596463929999994 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.07579 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.3 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.3629999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:3.2519299999999993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.2569999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:1.7582798950999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:80.0 - frontier/cluster_18/score:3.143183875099999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:3.3598999999999997 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.9423519899999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:32.0 - frontier/cluster_21/score:2.62613 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:2.3020699999999996 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.815586392999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:4.502993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.6601 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.60613 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.5984876999999997 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.6483519899999997 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.8470562999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.462350999999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.676550999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - 
frontier/cluster_33/score:3.5176456999999997 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:96.0 - frontier/cluster_35/score:3.7956989929999994 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.8823509999999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.2678699999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:1.8559626592999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.6569999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.4384876999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.7321299999999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9917645699999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:3.103706392999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:3.54235199 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:1.7398999999999996 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:4.47100139 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.037448999999999 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.462350999999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:2.0765509999999994 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:48.0 - frontier/cluster_53/score:2.462350999999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.1763509999999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:64.0 - frontier/cluster_55/score:3.6563519899999997 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4119299999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.0664119899999993 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.2536463929999995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:48.0 - frontier/cluster_61/score:2.2678699999999994 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:3.0864119899999993 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:320.0 - cluster/prob_snapshot/cluster_0:0.0192155263412191 - cluster/prob_snapshot/cluster_1:0.017475025493749567 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.020730432912713162 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.017089465988937454 - cluster/prob_snapshot/cluster_6:0.010950949694486557 - cluster/prob_snapshot/cluster_7:0.018933032834118495 - cluster/prob_snapshot/cluster_8:0.012172603152919242 - cluster/prob_snapshot/cluster_9:0.018482665981600782 - cluster/prob_snapshot/cluster_10:0.025452859963041082 - cluster/prob_snapshot/cluster_11:0.017235271741673903 - cluster/prob_snapshot/cluster_12:0.014363246858889808 - cluster/prob_snapshot/cluster_13:0.014756674925024615 - cluster/prob_snapshot/cluster_14:0.020307944938186752 - cluster/prob_snapshot/cluster_15:0.020339606530175698 - cluster/prob_snapshot/cluster_16:0.010980264426236596 - cluster/prob_snapshot/cluster_17:0.01163017702213056 - cluster/prob_snapshot/cluster_18:0.01962883735693181 - cluster/prob_snapshot/cluster_19:0.02098220570486255 - cluster/prob_snapshot/cluster_20:0.018374664338311163 - cluster/prob_snapshot/cluster_21:0.016399892814581 - cluster/prob_snapshot/cluster_22:0.014376173781062809 - cluster/prob_snapshot/cluster_23:0.017583027137039183 - cluster/prob_snapshot/cluster_24:0.028120695679501217 - cluster/prob_snapshot/cluster_25:0.010367141787149118 - cluster/prob_snapshot/cluster_26:0.0100301050771603 - cluster/prob_snapshot/cluster_27:0.016227269693429913 - cluster/prob_snapshot/cluster_28:0.016538666696348726 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.017779553242633667 - cluster/prob_snapshot/cluster_31:0.015377110985319204 - cluster/prob_snapshot/cluster_32:0.01671476641017755 - cluster/prob_snapshot/cluster_33:0.021967310239657496 - cluster/prob_snapshot/cluster_34:0.014363246858889808 - cluster/prob_snapshot/cluster_35:0.023703722451521067 - cluster/prob_snapshot/cluster_36:0.01799996475955126 - cluster/prob_snapshot/cluster_37:0.014162598545161054 - cluster/prob_snapshot/cluster_38:0.011590282537568478 - cluster/prob_snapshot/cluster_39:0.01659267256698705 - 
cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02147297724189402 - cluster/prob_snapshot/cluster_42:0.010816961209429914 - cluster/prob_snapshot/cluster_43:0.01868324046199579 - cluster/prob_snapshot/cluster_44:0.019382304826162374 - cluster/prob_snapshot/cluster_45:0.013051819971773782 - cluster/prob_snapshot/cluster_46:0.02212159830149981 - cluster/prob_snapshot/cluster_47:0.01799996475955126 - cluster/prob_snapshot/cluster_48:0.010865484004253205 - cluster/prob_snapshot/cluster_49:0.027920911596091072 - cluster/prob_snapshot/cluster_50:0.0127236447606079 - cluster/prob_snapshot/cluster_51:0.015377110985319204 - cluster/prob_snapshot/cluster_52:0.012967832446988906 - cluster/prob_snapshot/cluster_53:0.015377110985319204 - cluster/prob_snapshot/cluster_54:0.019835962401513696 - cluster/prob_snapshot/cluster_55:0.02283351575450565 - cluster/prob_snapshot/cluster_56:0.01506223738972265 - cluster/prob_snapshot/cluster_57:0.017235271741673903 - cluster/prob_snapshot/cluster_58:0.012904515445452057 - cluster/prob_snapshot/cluster_59:0.02031866362356322 - cluster/prob_snapshot/cluster_60:0.010616312895701164 - cluster/prob_snapshot/cluster_61:0.014162598545161054 - cluster/prob_snapshot/cluster_62:0.019274303182872755 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  40%|████      | 321/800 [10:50:39<18:48:36, 141.37s/it]
[36m(TaskRunner pid=2823680)[0m step:321 - global_seqlen/min:428829 - global_seqlen/max:485180 - global_seqlen/minmax_diff:56351 - global_seqlen/balanced_min:454428 - global_seqlen/balanced_max:454484 - global_seqlen/mean:454466.0 - frontier/skipped_zero_acc_count:28.0 - actor/entropy:np.float64(0.17410874485969544) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.015926314517855644 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.06673663786932593) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0010880169644224224) - actor/ppo_kl:np.float64(8.244069023248813e-05) - actor/pg_clipfrac_lower:np.float64(1.5741505412734112e-05) - actor/grad_norm:np.float64(0.4538692281796382) - perf/mfu/actor:np.float64(0.22501797812896468) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(104.56793975830078) - actor/lr:np.float64(1e-06) - training/global_step:321 - training/epoch:0 - critic/score/mean:0.706250011920929 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.7106151580810547 - critic/rewards/max:1.1502407789230347 - critic/rewards/min:-0.09203501790761948 - critic/advantages/mean:-0.2017432302236557 - critic/advantages/max:2.472966432571411 - critic/advantages/min:-2.4748036861419678 - critic/returns/mean:-0.2017432302236557 - critic/returns/max:2.472966432571411 - critic/returns/min:-2.4748036861419678 - response_length/mean:1327.5550537109375 - response_length/max:8192.0 - response_length/min:129.0 - response_length/clip_ratio:0.04749999940395355 - response_length_non_aborted/mean:1327.5550537109375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:129.0 - response_length_non_aborted/clip_ratio:0.04749999940395355 - response/aborted_ratio:0.0 - prompt_length/mean:228.89999389648438 - prompt_length/max:522.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.270571172237396e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.515367173589766) - timing_s/agent_loop/generate_sequences/max:np.float64(35.278065694496036) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.310484298695883) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(35.278065694496036) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:194 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:36.99407119676471 - timing_s/reward:0.00016964972019195557 - timing_s/old_log_prob:13.729553354904056 - timing_s/ref:27.845308498479426 - timing_s/adv:0.06829777453094721 - timing_s/update_actor:24.400051871314645 - timing_s/update_weights:37.930387964472175 - timing_s/step:141.34854358341545 - timing_s/stop_profile:7.004011422395706e-05 - timing_per_token_ms/adv:5.4850424948799685e-05 - timing_per_token_ms/update_actor:0.01959585393676226 - timing_per_token_ms/gen:0.034832898822237786 - timing_per_token_ms/ref:0.022362763859603577 - perf/total_num_tokens:1817864 - perf/time_per_step:141.34854358341545 - perf/throughput:3215.215300268031 - frontier/active_count:58.0 - frontier/completed_count:6.0 - frontier/blacklisted_count:795.0 - frontier/mean_score:2.747822938198947 - frontier/mean_frontier_pct:0.3608963681883991 - frontier/batch_easy_count:4.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:0.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.6538999999999997 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:2.798291989999999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.223708372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:2.815586392999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.7535856999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:3.0317640535099994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:1.9492101978589993 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.9717524750999993 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.07579 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.3 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.3629999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:3.2519299999999993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.2569999999999997 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - 
frontier/cluster_16/score:1.7582798950999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.700228712569999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:3.2519299999999993 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.9423519899999997 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:48.0 - frontier/cluster_21/score:2.7382909999999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:2.3020699999999996 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.815586392999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:80.0 - frontier/cluster_24/score:4.502993 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.6601 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.60613 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.5984876999999997 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.7538463929999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.8470562999999993 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:48.0 - frontier/cluster_31/score:2.462350999999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.676550999999999 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:3.5176456999999997 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.3 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:4.156989295099999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.9176456999999996 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.2678699999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:1.8559626592999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.6569999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.4384876999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:48.0 - frontier/cluster_42/score:1.7321299999999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9917645699999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:3.103706392999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:3.54235199 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.8823509999999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:2.1179299999999994 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:80.0 - frontier/cluster_49/score:4.029700972999999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.037448999999999 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.462350999999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:2.0765509999999994 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.623645699999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.1763509999999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:3.4594463929999995 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4119299999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:80.0 - frontier/cluster_58/score:2.0664119899999993 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.2536463929999995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:64.0 - frontier/cluster_62/score:3.060488392999999 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:321.0 - cluster/prob_snapshot/cluster_0:0.022926614006418114 - cluster/prob_snapshot/cluster_1:0.01755805039327338 - cluster/prob_snapshot/cluster_2:0.0 - 
cluster/prob_snapshot/cluster_3:0.02022735092285753 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.017666565158880657 - cluster/prob_snapshot/cluster_6:0.011002978316613619 - cluster/prob_snapshot/cluster_7:0.01902298481440579 - cluster/prob_snapshot/cluster_8:0.012230435924269182 - cluster/prob_snapshot/cluster_9:0.018646438577748527 - cluster/prob_snapshot/cluster_10:0.02557378803503623 - cluster/prob_snapshot/cluster_11:0.017317157556669134 - cluster/prob_snapshot/cluster_12:0.01443148751053988 - cluster/prob_snapshot/cluster_13:0.014826784777132925 - cluster/prob_snapshot/cluster_14:0.020404429208760844 - cluster/prob_snapshot/cluster_15:0.020436241226881906 - cluster/prob_snapshot/cluster_16:0.01103243232442131 - cluster/prob_snapshot/cluster_17:0.011685432694235411 - cluster/prob_snapshot/cluster_18:0.023217306283302173 - cluster/prob_snapshot/cluster_19:0.020404429208760844 - cluster/prob_snapshot/cluster_20:0.018461963476216155 - cluster/prob_snapshot/cluster_21:0.017181570594227716 - cluster/prob_snapshot/cluster_22:0.014444475849299363 - cluster/prob_snapshot/cluster_23:0.017666565158880657 - cluster/prob_snapshot/cluster_24:0.028254298799803696 - cluster/prob_snapshot/cluster_25:0.010416396702716197 - cluster/prob_snapshot/cluster_26:0.010077758711001486 - cluster/prob_snapshot/cluster_27:0.016304366429931084 - cluster/prob_snapshot/cluster_28:0.017279173837619472 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01786402497184951 - cluster/prob_snapshot/cluster_31:0.015450168566550161 - cluster/prob_snapshot/cluster_32:0.016794179272966527 - cluster/prob_snapshot/cluster_33:0.022071678254806222 - cluster/prob_snapshot/cluster_34:0.01443148751053988 - cluster/prob_snapshot/cluster_35:0.026083277866818962 - cluster/prob_snapshot/cluster_36:0.01830694238249147 - cluster/prob_snapshot/cluster_37:0.014229885904577421 - cluster/prob_snapshot/cluster_38:0.011645348668572315 - 
cluster/prob_snapshot/cluster_39:0.016671505354567153 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.021574996651171734 - cluster/prob_snapshot/cluster_42:0.010868353244187582 - cluster/prob_snapshot/cluster_43:0.018772005663665523 - cluster/prob_snapshot/cluster_44:0.019474391324766204 - cluster/prob_snapshot/cluster_45:0.013113829955229715 - cluster/prob_snapshot/cluster_46:0.022226699348530908 - cluster/prob_snapshot/cluster_47:0.01808548367717049 - cluster/prob_snapshot/cluster_48:0.013289078410085964 - cluster/prob_snapshot/cluster_49:0.025284599679591253 - cluster/prob_snapshot/cluster_50:0.012784095563853024 - cluster/prob_snapshot/cluster_51:0.015450168566550161 - cluster/prob_snapshot/cluster_52:0.013029443400651777 - cluster/prob_snapshot/cluster_53:0.01646222180505724 - cluster/prob_snapshot/cluster_54:0.019930204254604716 - cluster/prob_snapshot/cluster_55:0.02170650322346162 - cluster/prob_snapshot/cluster_56:0.015133798987520193 - cluster/prob_snapshot/cluster_57:0.017317157556669134 - cluster/prob_snapshot/cluster_58:0.012965825576223847 - cluster/prob_snapshot/cluster_59:0.02041519881925766 - cluster/prob_snapshot/cluster_60:0.010666751638225128 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.01920321739988337 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  40%|████      | 322/800 [10:53:08<19:04:51, 143.71s/it]
[36m(TaskRunner pid=2823680)[0m step:322 - global_seqlen/min:402916 - global_seqlen/max:495019 - global_seqlen/minmax_diff:92103 - global_seqlen/balanced_min:443955 - global_seqlen/balanced_max:444145 - global_seqlen/mean:444010.25 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.15869434494921503) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.016240769997239113 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.003876922659401316) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0009018028308383555) - actor/ppo_kl:np.float64(0.0002116923037139249) - actor/pg_clipfrac_lower:np.float64(2.0505831866528828e-05) - actor/grad_norm:np.float64(0.43855031083027524) - perf/mfu/actor:np.float64(0.21895142102479545) - perf/max_memory_allocated_gb:np.float64(95.01034832000732) - perf/max_memory_reserved_gb:np.float64(101.064453125) - perf/cpu_memory_used_gb:np.float64(104.91222763061523) - actor/lr:np.float64(1e-06) - training/global_step:322 - training/epoch:0 - critic/score/mean:0.6609042286872864 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6698336601257324 - critic/rewards/max:1.3188183307647705 - critic/rewards/min:-0.0876525416970253 - critic/advantages/mean:-0.16517551243305206 - critic/advantages/max:2.4723868370056152 - critic/advantages/min:-2.474799871444702 - critic/returns/mean:-0.16517551243305206 - critic/returns/max:2.4723868370056152 - critic/returns/min:-2.474799871444702 - response_length/mean:1391.223388671875 - response_length/max:8192.0 - response_length/min:149.0 - response_length/clip_ratio:0.05585106462240219 - response_length_non_aborted/mean:1391.223388671875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:149.0 - response_length_non_aborted/clip_ratio:0.05585106462240219 - response/aborted_ratio:0.0 - prompt_length/mean:235.1702117919922 - prompt_length/max:641.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.0001354515552520752 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2578392419964075) - timing_s/agent_loop/generate_sequences/max:np.float64(34.5391318006441) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.760546684609835) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(34.5391318006441) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:172 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:37.19976203981787 - timing_s/reward:0.00019340869039297104 - timing_s/old_log_prob:13.57734292279929 - timing_s/ref:32.22030827496201 - timing_s/adv:0.08477251045405865 - timing_s/update_actor:24.42566192150116 - timing_s/update_weights:40.93839394953102 - timing_s/step:148.93305652681738 - timing_s/stop_profile:5.325954407453537e-05 - timing_per_token_ms/adv:6.931249669191941e-05 - timing_per_token_ms/update_actor:0.01997113925332543 - timing_per_token_ms/gen:0.03555702737508877 - timing_per_token_ms/ref:0.026344271259150915 - perf/total_num_tokens:1776041 - perf/time_per_step:148.93305652681738 - perf/throughput:2981.2740056137236 - frontier/active_count:57.0 - frontier/completed_count:7.0 - frontier/blacklisted_count:829.0 - frontier/mean_score:2.7284229330002625 - frontier/mean_frontier_pct:0.3633924832247227 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:32.0 - frontier/cluster_0/score:3.6538999999999997 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:64.0 - frontier/cluster_1/score:2.798291989999999 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.223708372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:2.815586392999999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.7535856999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:3.0222348374569994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:112.0 - frontier/cluster_8/score:1.9492101978589993 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.9717524750999993 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.07579 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.3 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.3629999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:32.0 - frontier/cluster_14/score:3.2519299999999993 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:32.0 - frontier/cluster_15/score:3.1798999999999995 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - 
frontier/cluster_16/score:1.7582798950999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.490160098798999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:3.2519299999999993 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.9596463929999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:48.0 - frontier/cluster_21/score:2.7382909999999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:2.3020699999999996 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.815586392999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:4.6520951 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.6601 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.60613 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:48.0 - frontier/cluster_27/score:2.5984876999999997 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.7538463929999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.8929394099999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.623645699999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.676550999999999 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:3.5176456999999997 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.51 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:4.156989295099999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.9176456999999996 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:48.0 - frontier/cluster_37/score:2.2678699999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:1.8559626592999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.6569999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:48.0 - frontier/cluster_41/score:3.4384876999999996 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.5124909999999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9917645699999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:3.103706392999999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:3.54235199 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:48.0 - frontier/cluster_47/score:2.9176456999999996 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:2.1179299999999994 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:3.720790681099999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.037448999999999 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.462350999999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:64.0 - frontier/cluster_52/score:2.0765509999999994 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.623645699999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.1234456999999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:3.3216124750999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4119299999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.7598999999999996 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:1.7464883929999995 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.1775524750999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:322.0 - cluster/prob_snapshot/cluster_0:0.02349471117420917 - cluster/prob_snapshot/cluster_1:0.017993120251280276 - cluster/prob_snapshot/cluster_2:0.0 - 
cluster/prob_snapshot/cluster_3:0.02072856321560928 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.018104323897634955 - cluster/prob_snapshot/cluster_6:0.011275620444107227 - cluster/prob_snapshot/cluster_7:0.019433080983794083 - cluster/prob_snapshot/cluster_8:0.012533493148832835 - cluster/prob_snapshot/cluster_9:0.01910847754008476 - cluster/prob_snapshot/cluster_10:0.02620747936635649 - cluster/prob_snapshot/cluster_11:0.01774625834579487 - cluster/prob_snapshot/cluster_12:0.014789084457889129 - cluster/prob_snapshot/cluster_13:0.015194176771300872 - cluster/prob_snapshot/cluster_14:0.020910029313540603 - cluster/prob_snapshot/cluster_15:0.02044687376853984 - cluster/prob_snapshot/cluster_16:0.011305804290974926 - cluster/prob_snapshot/cluster_17:0.011974985317058378 - cluster/prob_snapshot/cluster_18:0.022441857596823073 - cluster/prob_snapshot/cluster_19:0.020910029313540603 - cluster/prob_snapshot/cluster_20:0.019030634987636484 - cluster/prob_snapshot/cluster_21:0.01760731168229464 - cluster/prob_snapshot/cluster_22:0.014802394633901227 - cluster/prob_snapshot/cluster_23:0.018104323897634955 - cluster/prob_snapshot/cluster_24:0.02991314232175312 - cluster/prob_snapshot/cluster_25:0.01067450396023554 - cluster/prob_snapshot/cluster_26:0.010327474878412813 - cluster/prob_snapshot/cluster_27:0.016708371329602636 - cluster/prob_snapshot/cluster_28:0.017707333430491447 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01860170663741128 - cluster/prob_snapshot/cluster_31:0.016870138193425057 - cluster/prob_snapshot/cluster_32:0.017210321215151127 - cluster/prob_snapshot/cluster_33:0.022618591021839357 - cluster/prob_snapshot/cluster_34:0.016139392169261614 - cluster/prob_snapshot/cluster_35:0.026729593815554297 - cluster/prob_snapshot/cluster_36:0.01876056898934654 - cluster/prob_snapshot/cluster_37:0.014582487378049135 - cluster/prob_snapshot/cluster_38:0.011933908051772262 - 
cluster/prob_snapshot/cluster_39:0.017084607567222354 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02210960217509258 - cluster/prob_snapshot/cluster_42:0.00972537266991182 - cluster/prob_snapshot/cluster_43:0.019237156045152325 - cluster/prob_snapshot/cluster_44:0.019956946077638007 - cluster/prob_snapshot/cluster_45:0.013438776746516643 - cluster/prob_snapshot/cluster_46:0.02277745337377462 - cluster/prob_snapshot/cluster_47:0.01876056898934654 - cluster/prob_snapshot/cluster_48:0.013618367672129181 - cluster/prob_snapshot/cluster_49:0.023924820709962914 - cluster/prob_snapshot/cluster_50:0.013100871886800754 - cluster/prob_snapshot/cluster_51:0.015833007349551193 - cluster/prob_snapshot/cluster_52:0.013352299182658314 - cluster/prob_snapshot/cluster_53:0.016870138193425057 - cluster/prob_snapshot/cluster_54:0.020083870546491578 - cluster/prob_snapshot/cluster_55:0.021358090187231324 - cluster/prob_snapshot/cluster_56:0.015508798468050661 - cluster/prob_snapshot/cluster_57:0.01774625834579487 - cluster/prob_snapshot/cluster_58:0.011229984499478284 - cluster/prob_snapshot/cluster_59:0.020431779097229798 - cluster/prob_snapshot/cluster_60:0.010931062425396312 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  40%|████      | 323/800 [10:55:53<19:51:51, 149.92s/it]
[36m(TaskRunner pid=2823680)[0m step:323 - global_seqlen/min:414148 - global_seqlen/max:600967 - global_seqlen/minmax_diff:186819 - global_seqlen/balanced_min:501589 - global_seqlen/balanced_max:501761 - global_seqlen/mean:501671.25 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.16985039240292585) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01461553294211626 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.029147602035664022) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0017969260776832623) - actor/ppo_kl:np.float64(0.001679183459036911) - actor/pg_clipfrac_lower:np.float64(0.00018200888490582656) - actor/grad_norm:np.float64(0.43225911259651184) - perf/mfu/actor:np.float64(0.23680705898349855) - perf/max_memory_allocated_gb:np.float64(97.67490196228027) - perf/max_memory_reserved_gb:np.float64(104.267578125) - perf/cpu_memory_used_gb:np.float64(104.40519714355469) - actor/lr:np.float64(1e-06) - training/global_step:323 - training/epoch:0 - critic/score/mean:0.6171875 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6316128373146057 - critic/rewards/max:1.5560694932937622 - critic/rewards/min:-0.11285603046417236 - critic/advantages/mean:-0.18286779522895813 - critic/advantages/max:2.4733331203460693 - critic/advantages/min:-2.474270820617676 - critic/returns/mean:-0.18286779522895813 - critic/returns/max:2.4733331203460693 - critic/returns/min:-2.474270820617676 - response_length/mean:1569.19921875 - response_length/max:8192.0 - response_length/min:154.0 - response_length/clip_ratio:0.0716145858168602 - response_length_non_aborted/mean:1569.19921875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:154.0 - response_length_non_aborted/clip_ratio:0.0716145858168602 - response/aborted_ratio:0.0 - prompt_length/mean:238.9895782470703 - prompt_length/max:620.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010587647557258606 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.173169631510973) - timing_s/agent_loop/generate_sequences/max:np.float64(38.46579866204411) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.786956150675906) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(38.46579866204411) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:187 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:40.69019711203873 - timing_s/reward:0.00017688609659671783 - timing_s/old_log_prob:14.677227179519832 - timing_s/ref:37.3020410137251 - timing_s/adv:0.07735136337578297 - timing_s/update_actor:25.742695262655616 - timing_s/update_weights:44.839236522093415 - timing_s/step:163.76002008188516 - timing_s/stop_profile:5.188398063182831e-05 - timing_per_token_ms/adv:5.570099811821291e-05 - timing_per_token_ms/update_actor:0.01853740849294235 - timing_per_token_ms/gen:0.03376373557707888 - timing_per_token_ms/ref:0.02686133541327475 - perf/total_num_tokens:2006685 - perf/time_per_step:163.76002008188516 - perf/throughput:3063.4537645339115 - frontier/active_count:57.0 - frontier/completed_count:7.0 - frontier/blacklisted_count:861.0 - frontier/mean_score:2.7195197689641626 - frontier/mean_frontier_pct:0.38116874281438956 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:4.057729999999999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.2588043929999992 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.223708372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:2.8709104750999987 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.7535856999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:96.0 - frontier/cluster_7/score:3.0222348374569994 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:128.0 - frontier/cluster_8/score:1.6644471385012996 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.9717524750999993 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.07579 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.3 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.3629999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:3.1763509999999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.1259299999999994 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:1.7582798950999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.490160098798999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:32.0 - frontier/cluster_19/score:3.2519299999999993 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.9596463929999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:48.0 - frontier/cluster_21/score:2.7382909999999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:2.3020699999999996 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.815586392999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:96.0 - frontier/cluster_24/score:4.6520951 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.6601 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:48.0 - frontier/cluster_26/score:1.60613 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.7189413899999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.7538463929999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:64.0 - frontier/cluster_30/score:2.8929394099999994 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.623645699999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.676550999999999 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:3.5176456999999997 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.6569999999999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:4.156989295099999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:48.0 - frontier/cluster_36/score:2.9176456999999996 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.8875089999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:1.8559626592999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.6569999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:3.3069413899999995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.5124909999999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9917645699999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:3.072594475099999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:3.54235199 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.9423519899999997 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:2.1179299999999994 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - 
frontier/cluster_49/score:3.720790681099999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.037448999999999 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.462350999999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.7535856999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.736551989999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:48.0 - frontier/cluster_54/score:3.1234456999999995 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:3.3216124750999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4119299999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:32.0 - frontier/cluster_57/score:2.8319299999999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:96.0 - frontier/cluster_58/score:2.1225418750999996 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.1775524750999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:323.0 - cluster/prob_snapshot/cluster_0:0.02617677077638967 - cluster/prob_snapshot/cluster_1:0.014571744503518716 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02079642438751447 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01852049432225127 - cluster/prob_snapshot/cluster_6:0.01131253457121465 - cluster/prob_snapshot/cluster_7:0.01949670100586564 - cluster/prob_snapshot/cluster_8:0.010737493922455717 - cluster/prob_snapshot/cluster_9:0.019171034875376466 - cluster/prob_snapshot/cluster_10:0.0262932774143921 - cluster/prob_snapshot/cluster_11:0.01780435604777988 - cluster/prob_snapshot/cluster_12:0.01483750096376453 - cluster/prob_snapshot/cluster_13:0.015243919468424166 - cluster/prob_snapshot/cluster_14:0.020490917836414966 - cluster/prob_snapshot/cluster_15:0.02016564755985237 - cluster/prob_snapshot/cluster_16:0.011342817233962627 - cluster/prob_snapshot/cluster_17:0.012014189024942536 - cluster/prob_snapshot/cluster_18:0.022515327752009852 - cluster/prob_snapshot/cluster_19:0.020978484569171646 - cluster/prob_snapshot/cluster_20:0.019092937481973788 - cluster/prob_snapshot/cluster_21:0.017664954500681625 - cluster/prob_snapshot/cluster_22:0.014850854714631916 - cluster/prob_snapshot/cluster_23:0.01816359383465208 - cluster/prob_snapshot/cluster_24:0.03001107196946707 - cluster/prob_snapshot/cluster_25:0.010709450152150217 - cluster/prob_snapshot/cluster_26:0.010361284966491795 - cluster/prob_snapshot/cluster_27:0.017540128475888813 - cluster/prob_snapshot/cluster_28:0.01776530370008564 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.018662604906081473 - cluster/prob_snapshot/cluster_31:0.016925367653185502 - cluster/prob_snapshot/cluster_32:0.017266664366115177 - cluster/prob_snapshot/cluster_33:0.022692639766927022 - cluster/prob_snapshot/cluster_34:0.017140539156835805 - cluster/prob_snapshot/cluster_35:0.026817101161915248 - cluster/prob_snapshot/cluster_36:0.018821987341597145 - cluster/prob_snapshot/cluster_37:0.012176485481136617 - cluster/prob_snapshot/cluster_38:0.011972977280902054 - cluster/prob_snapshot/cluster_39:0.017140539156835805 - 
cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02133336785271209 - cluster/prob_snapshot/cluster_42:0.009757211595732685 - cluster/prob_snapshot/cluster_43:0.01930013464814416 - cluster/prob_snapshot/cluster_44:0.019821575428501655 - cluster/prob_snapshot/cluster_45:0.013482772614899073 - cluster/prob_snapshot/cluster_46:0.022852022202442698 - cluster/prob_snapshot/cluster_47:0.01898136977711282 - cluster/prob_snapshot/cluster_48:0.013662951485298176 - cluster/prob_snapshot/cluster_49:0.024003145789907532 - cluster/prob_snapshot/cluster_50:0.013143761522226552 - cluster/prob_snapshot/cluster_51:0.01588484145027241 - cluster/prob_snapshot/cluster_52:0.01131253457121465 - cluster/prob_snapshot/cluster_53:0.017653735995224663 - cluster/prob_snapshot/cluster_54:0.020149621123485294 - cluster/prob_snapshot/cluster_55:0.021428012304586318 - cluster/prob_snapshot/cluster_56:0.015559571173709817 - cluster/prob_snapshot/cluster_57:0.018269027871440732 - cluster/prob_snapshot/cluster_58:0.013692703094533399 - cluster/prob_snapshot/cluster_59:0.020498668657264615 - cluster/prob_snapshot/cluster_60:0.010966848538434654 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  40%|████      | 324/800 [10:58:26<19:57:11, 150.91s/it]
[36m(TaskRunner pid=2823680)[0m step:324 - global_seqlen/min:433008 - global_seqlen/max:560057 - global_seqlen/minmax_diff:127049 - global_seqlen/balanced_min:492623 - global_seqlen/balanced_max:493141 - global_seqlen/mean:492867.25 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.16203520237943347) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.015415354631841183 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.02550776338466676) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0011206398961877888) - actor/ppo_kl:np.float64(0.0006883927251353772) - actor/pg_clipfrac_lower:np.float64(2.4535600608274132e-05) - actor/grad_norm:np.float64(0.4618793852054156) - perf/mfu/actor:np.float64(0.2508072483794894) - perf/max_memory_allocated_gb:np.float64(97.67490196228027) - perf/max_memory_reserved_gb:np.float64(104.267578125) - perf/cpu_memory_used_gb:np.float64(104.97262573242188) - actor/lr:np.float64(1e-06) - training/global_step:324 - training/epoch:0 - critic/score/mean:0.6313775777816772 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6466708183288574 - critic/rewards/max:1.6095887422561646 - critic/rewards/min:-0.11503283679485321 - critic/advantages/mean:-0.160878986120224 - critic/advantages/max:2.474144458770752 - critic/advantages/min:-2.4748129844665527 - critic/returns/mean:-0.160878986120224 - critic/returns/max:2.474144458770752 - critic/returns/min:-2.4748129844665527 - response_length/mean:1547.455322265625 - response_length/max:8192.0 - response_length/min:192.0 - response_length/clip_ratio:0.06632652878761292 - response_length_non_aborted/mean:1547.455322265625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:192.0 - response_length_non_aborted/clip_ratio:0.06632652878761292 - response/aborted_ratio:0.0 - prompt_length/mean:243.08163452148438 - prompt_length/max:458.0 - prompt_length/min:185.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.736085146665573e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5136323617771268) - timing_s/agent_loop/generate_sequences/max:np.float64(39.105735153891146) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.755043546108027) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(39.105735153891146) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:190 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:40.667779297567904 - timing_s/reward:0.00013054907321929932 - timing_s/old_log_prob:12.659172841347754 - timing_s/ref:32.58929122239351 - timing_s/adv:0.09517623577266932 - timing_s/update_actor:23.82114014774561 - timing_s/update_weights:42.24618044588715 - timing_s/step:152.50661086943 - timing_s/stop_profile:5.291495472192764e-05 - timing_per_token_ms/adv:6.779991734655856e-05 - timing_per_token_ms/update_actor:0.016969270953051514 - timing_per_token_ms/gen:0.03352094600464712 - timing_per_token_ms/ref:0.02321536708531709 - perf/total_num_tokens:1971469 - perf/time_per_step:152.50661086943 - perf/throughput:3231.776296058228 - frontier/active_count:55.0 - frontier/completed_count:9.0 - frontier/blacklisted_count:891.0 - frontier/mean_score:2.683915698538905 - frontier/mean_frontier_pct:0.3751252482861921 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:48.0 - frontier/cluster_0/score:4.057729999999999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.4811630750999996 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.223708372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:2.8709104750999987 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.7535856999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:3.0155643862198995 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:144.0 - frontier/cluster_8/score:1.4651129969509096 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.9717524750999993 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.07579 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.3 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.3629999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:3.1763509999999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.0881509999999994 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - 
frontier/cluster_16/score:1.7582798950999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.490160098798999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.1763509999999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.9596463929999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:48.0 - frontier/cluster_21/score:2.8168036999999995 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:2.3020699999999996 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.815586392999999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.6601 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:64.0 - frontier/cluster_26/score:1.424291 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.7189413899999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:64.0 - frontier/cluster_28/score:2.7538463929999994 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:2.9250575869999995 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.623645699999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.676550999999999 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:3.5176456999999997 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.6569999999999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.809892506569999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:2.9423519899999997 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:1.8875089999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:1.8559626592999996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.6569999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:3.3069413899999995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.5124909999999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9917645699999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:80.0 - frontier/cluster_44/score:3.072594475099999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:3.379646393 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.9596463929999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:48.0 - frontier/cluster_48/score:2.1179299999999994 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:3.720790681099999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.037448999999999 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.462350999999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.7535856999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:64.0 - frontier/cluster_53/score:2.736551989999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.0864119899999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:3.3216124750999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4119299999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:48.0 - frontier/cluster_57/score:2.8823509999999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.1775524750999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:324.0 - cluster/prob_snapshot/cluster_0:0.027488534431641218 - cluster/prob_snapshot/cluster_1:0.016808298388607215 - cluster/prob_snapshot/cluster_2:0.0 - 
cluster/prob_snapshot/cluster_3:0.021838569497916464 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01944858860617778 - cluster/prob_snapshot/cluster_6:0.011879425416004431 - cluster/prob_snapshot/cluster_7:0.020428526654419278 - cluster/prob_snapshot/cluster_8:0.009925206719749747 - cluster/prob_snapshot/cluster_9:0.0201317288814439 - cluster/prob_snapshot/cluster_10:0.027610879420547685 - cluster/prob_snapshot/cluster_11:0.018696563393297876 - cluster/prob_snapshot/cluster_12:0.015581034024633183 - cluster/prob_snapshot/cluster_13:0.016007818869655743 - cluster/prob_snapshot/cluster_14:0.021517753480512013 - cluster/prob_snapshot/cluster_15:0.020920254697480427 - cluster/prob_snapshot/cluster_16:0.011911225595818068 - cluster/prob_snapshot/cluster_17:0.01261624099861288 - cluster/prob_snapshot/cluster_18:0.023643610109045355 - cluster/prob_snapshot/cluster_19:0.021517753480512013 - cluster/prob_snapshot/cluster_20:0.020049717891398203 - cluster/prob_snapshot/cluster_21:0.019082049691483752 - cluster/prob_snapshot/cluster_22:0.015595056955255352 - cluster/prob_snapshot/cluster_23:0.01907380321244661 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.011246119384475456 - cluster/prob_snapshot/cluster_26:0.00964866370955601 - cluster/prob_snapshot/cluster_27:0.01841909491677106 - cluster/prob_snapshot/cluster_28:0.018655554064324502 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.019815400776981927 - cluster/prob_snapshot/cluster_31:0.01777352735664458 - cluster/prob_snapshot/cluster_32:0.018131927043333024 - cluster/prob_snapshot/cluster_33:0.023829807538393307 - cluster/prob_snapshot/cluster_34:0.017999481479761027 - cluster/prob_snapshot/cluster_35:0.025809593380461807 - cluster/prob_snapshot/cluster_36:0.019932559334190067 - cluster/prob_snapshot/cluster_37:0.01278667041339189 - cluster/prob_snapshot/cluster_38:0.012572964062174773 - 
cluster/prob_snapshot/cluster_39:0.017999481479761027 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02240242013698163 - cluster/prob_snapshot/cluster_42:0.010246162492587594 - cluster/prob_snapshot/cluster_43:0.02026729806907046 - cluster/prob_snapshot/cluster_44:0.02081486915671001 - cluster/prob_snapshot/cluster_45:0.01415841787455798 - cluster/prob_snapshot/cluster_46:0.02289495019154861 - cluster/prob_snapshot/cluster_47:0.020049717891398203 - cluster/prob_snapshot/cluster_48:0.014347625822517978 - cluster/prob_snapshot/cluster_49:0.025205985304677027 - cluster/prob_snapshot/cluster_50:0.013802418344545583 - cluster/prob_snapshot/cluster_51:0.016680858570256318 - cluster/prob_snapshot/cluster_52:0.011879425416004431 - cluster/prob_snapshot/cluster_53:0.018538395507116362 - cluster/prob_snapshot/cluster_54:0.020908474013141654 - cluster/prob_snapshot/cluster_55:0.022501807387469187 - cluster/prob_snapshot/cluster_56:0.016339288432623264 - cluster/prob_snapshot/cluster_57:0.01952609087040673 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.021525892708517597 - cluster/prob_snapshot/cluster_60:0.011516416452989744 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) local_global_step_folder: /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/checkpoints/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633/global_step_325
(WorkerDict pid=2825160) /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.
(WorkerDict pid=2825160)   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")
(TaskRunner pid=2823680) test_gen_batch meta info: {'eos_token_id': 151643, 'pad_token_id': 151643, 'recompute_log_prob': False, 'do_sample': True, 'validate': True, 'global_steps': 325}
(TaskRunner pid=2823680) validation generation end
(TaskRunner pid=2823680) Training Progress:  41%|████      | 325/800 [11:04:44<28:52:49, 218.88s/it]
(WorkerDict pid=2825159) /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead. [repeated 3x across cluster]
(WorkerDict pid=2825159)   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}") [repeated 3x across cluster]
[36m(TaskRunner pid=2823680)[0m step:325 - global_seqlen/min:435014 - global_seqlen/max:577863 - global_seqlen/minmax_diff:142849 - global_seqlen/balanced_min:515571 - global_seqlen/balanced_max:515767 - global_seqlen/mean:515667.5 - frontier/skipped_zero_acc_count:36.0 - actor/entropy:np.float64(0.1606660542766685) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01596938632428646 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05897753007593565) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0027702090867577404) - actor/ppo_kl:np.float64(0.00769520486318984) - actor/pg_clipfrac_lower:np.float64(0.00019919444398312962) - actor/grad_norm:np.float64(0.4141726916035016) - perf/mfu/actor:np.float64(0.25948585425160337) - perf/max_memory_allocated_gb:np.float64(97.67490196228027) - perf/max_memory_reserved_gb:np.float64(104.267578125) - perf/cpu_memory_used_gb:np.float64(104.3890609741211) - actor/lr:np.float64(1e-06) - val-aux/aime2024/reward/mean@16:np.float64(0.08333333333333333) - val-aux/aime2024/reward/std@16:np.float64(0.11659183334901577) - val-aux/aime2024/reward/best@2/mean:np.float64(0.13446666666666665) - val-aux/aime2024/reward/best@2/std:np.float64(0.122040495842669) - val-aux/aime2024/reward/worst@2/mean:np.float64(0.03056666666666667) - val-aux/aime2024/reward/worst@2/std:np.float64(0.07523037196254681) - val-aux/aime2024/reward/maj@2/mean:np.float64(0.0821) - val-aux/aime2024/reward/maj@2/std:np.float64(0.1154999874617456) - val-aux/aime2024/reward/best@4/mean:np.float64(0.19136666666666666) - val-aux/aime2024/reward/best@4/std:np.float64(0.10454316014916291) - val-aux/aime2024/reward/worst@4/mean:np.float64(0.0058000000000000005) - val-aux/aime2024/reward/worst@4/std:np.float64(0.026230858014442294) - val-aux/aime2024/reward/maj@4/mean:np.float64(0.09926666666666668) - val-aux/aime2024/reward/maj@4/std:np.float64(0.11103594283102389) - val-aux/aime2024/reward/best@8/mean:np.float64(0.23463333333333333) - 
val-aux/aime2024/reward/best@8/std:np.float64(0.0667135053520526) - val-aux/aime2024/reward/worst@8/mean:np.float64(0.00030000000000000003) - val-aux/aime2024/reward/worst@8/std:np.float64(0.004268309106525206) - val-aux/aime2024/reward/maj@8/mean:np.float64(0.11620000000000001) - val-aux/aime2024/reward/maj@8/std:np.float64(0.0977065008734582) - val-aux/aime2024/reward/best@16/mean:np.float64(0.25853333333333334) - val-aux/aime2024/reward/best@16/std:np.float64(0.029489357424384705) - val-aux/aime2024/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2024/reward/worst@16/std:np.float64(0.0) - val-aux/aime2024/reward/maj@16/mean:np.float64(0.1278) - val-aux/aime2024/reward/maj@16/std:np.float64(0.07561550790664848) - val-aux/aime2024/score/mean@16:np.float64(0.08333333333333333) - val-aux/aime2024/score/std@16:np.float64(0.11659183334901577) - val-aux/aime2024/score/best@2/mean:np.float64(0.13446666666666665) - val-aux/aime2024/score/best@2/std:np.float64(0.122040495842669) - val-aux/aime2024/score/worst@2/mean:np.float64(0.03056666666666667) - val-aux/aime2024/score/worst@2/std:np.float64(0.07523037196254681) - val-aux/aime2024/score/maj@2/mean:np.float64(0.0821) - val-aux/aime2024/score/maj@2/std:np.float64(0.1154999874617456) - val-aux/aime2024/score/best@4/mean:np.float64(0.19136666666666666) - val-aux/aime2024/score/best@4/std:np.float64(0.10454316014916291) - val-aux/aime2024/score/worst@4/mean:np.float64(0.0058000000000000005) - val-aux/aime2024/score/worst@4/std:np.float64(0.026230858014442294) - val-aux/aime2024/score/maj@4/mean:np.float64(0.09926666666666668) - val-aux/aime2024/score/maj@4/std:np.float64(0.11103594283102389) - val-aux/aime2024/score/best@8/mean:np.float64(0.23463333333333333) - val-aux/aime2024/score/best@8/std:np.float64(0.0667135053520526) - val-aux/aime2024/score/worst@8/mean:np.float64(0.00030000000000000003) - val-aux/aime2024/score/worst@8/std:np.float64(0.004268309106525206) - 
val-aux/aime2024/score/maj@8/mean:np.float64(0.11620000000000001) - val-aux/aime2024/score/maj@8/std:np.float64(0.0977065008734582) - val-aux/aime2024/score/best@16/mean:np.float64(0.25853333333333334) - val-aux/aime2024/score/best@16/std:np.float64(0.029489357424384705) - val-aux/aime2024/score/worst@16/mean:np.float64(0.0) - val-aux/aime2024/score/worst@16/std:np.float64(0.0) - val-aux/aime2024/score/maj@16/mean:np.float64(0.1278) - val-aux/aime2024/score/maj@16/std:np.float64(0.07561550790664848) - val-core/aime2024/acc/mean@16:np.float64(0.08333333333333333) - val-aux/aime2024/acc/std@16:np.float64(0.11659183334901577) - val-aux/aime2024/acc/best@2/mean:np.float64(0.13446666666666665) - val-aux/aime2024/acc/best@2/std:np.float64(0.122040495842669) - val-aux/aime2024/acc/worst@2/mean:np.float64(0.03056666666666667) - val-aux/aime2024/acc/worst@2/std:np.float64(0.07523037196254681) - val-aux/aime2024/acc/maj@2/mean:np.float64(0.0821) - val-aux/aime2024/acc/maj@2/std:np.float64(0.1154999874617456) - val-aux/aime2024/acc/best@4/mean:np.float64(0.19136666666666666) - val-aux/aime2024/acc/best@4/std:np.float64(0.10454316014916291) - val-aux/aime2024/acc/worst@4/mean:np.float64(0.0058000000000000005) - val-aux/aime2024/acc/worst@4/std:np.float64(0.026230858014442294) - val-aux/aime2024/acc/maj@4/mean:np.float64(0.09926666666666668) - val-aux/aime2024/acc/maj@4/std:np.float64(0.11103594283102389) - val-aux/aime2024/acc/best@8/mean:np.float64(0.23463333333333333) - val-aux/aime2024/acc/best@8/std:np.float64(0.0667135053520526) - val-aux/aime2024/acc/worst@8/mean:np.float64(0.00030000000000000003) - val-aux/aime2024/acc/worst@8/std:np.float64(0.004268309106525206) - val-aux/aime2024/acc/maj@8/mean:np.float64(0.11620000000000001) - val-aux/aime2024/acc/maj@8/std:np.float64(0.0977065008734582) - val-core/aime2024/acc/best@16/mean:np.float64(0.25853333333333334) - val-core/aime2024/acc/best@16/std:np.float64(0.029489357424384705) - 
val-aux/aime2024/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2024/acc/worst@16/std:np.float64(0.0) - val-core/aime2024/acc/maj@16/mean:np.float64(0.1278) - val-core/aime2024/acc/maj@16/std:np.float64(0.07561550790664848) - val-aux/aime2025/reward/mean@16:np.float64(0.06041666666666667) - val-aux/aime2025/reward/std@16:np.float64(0.0918211299175876) - val-aux/aime2025/reward/best@2/mean:np.float64(0.09736666666666667) - val-aux/aime2025/reward/best@2/std:np.float64(0.10005625158927596) - val-aux/aime2025/reward/worst@2/mean:np.float64(0.02163333333333333) - val-aux/aime2025/reward/worst@2/std:np.float64(0.05652056991149442) - val-aux/aime2025/reward/maj@2/mean:np.float64(0.05846666666666667) - val-aux/aime2025/reward/maj@2/std:np.float64(0.09157997492700072) - val-aux/aime2025/reward/best@4/mean:np.float64(0.14256666666666667) - val-aux/aime2025/reward/best@4/std:np.float64(0.0914411049241623) - val-aux/aime2025/reward/worst@4/mean:np.float64(0.0033666666666666662) - val-aux/aime2025/reward/worst@4/std:np.float64(0.021257988243412563) - val-aux/aime2025/reward/maj@4/mean:np.float64(0.06713333333333334) - val-aux/aime2025/reward/maj@4/std:np.float64(0.09023409187992429) - val-aux/aime2025/reward/best@8/mean:np.float64(0.17966666666666667) - val-aux/aime2025/reward/best@8/std:np.float64(0.06592044034106512) - val-aux/aime2025/reward/worst@8/mean:np.float64(0.0001) - val-aux/aime2025/reward/worst@8/std:np.float64(0.0025427859021978525) - val-aux/aime2025/reward/maj@8/mean:np.float64(0.07436666666666666) - val-aux/aime2025/reward/maj@8/std:np.float64(0.08433171907768597) - val-aux/aime2025/reward/best@16/mean:np.float64(0.20646666666666666) - val-aux/aime2025/reward/best@16/std:np.float64(0.043252253371853185) - val-aux/aime2025/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2025/reward/worst@16/std:np.float64(0.0) - val-aux/aime2025/reward/maj@16/mean:np.float64(0.08046666666666667) - val-aux/aime2025/reward/maj@16/std:np.float64(0.07302737628988787) - 
val-aux/aime2025/score/mean@16:np.float64(0.06041666666666667) - val-aux/aime2025/score/std@16:np.float64(0.0918211299175876) - val-aux/aime2025/score/best@2/mean:np.float64(0.09736666666666667) - val-aux/aime2025/score/best@2/std:np.float64(0.10005625158927596) - val-aux/aime2025/score/worst@2/mean:np.float64(0.02163333333333333) - val-aux/aime2025/score/worst@2/std:np.float64(0.05652056991149442) - val-aux/aime2025/score/maj@2/mean:np.float64(0.05846666666666667) - val-aux/aime2025/score/maj@2/std:np.float64(0.09157997492700072) - val-aux/aime2025/score/best@4/mean:np.float64(0.14256666666666667) - val-aux/aime2025/score/best@4/std:np.float64(0.0914411049241623) - val-aux/aime2025/score/worst@4/mean:np.float64(0.0033666666666666662) - val-aux/aime2025/score/worst@4/std:np.float64(0.021257988243412563) - val-aux/aime2025/score/maj@4/mean:np.float64(0.06713333333333334) - val-aux/aime2025/score/maj@4/std:np.float64(0.09023409187992429) - val-aux/aime2025/score/best@8/mean:np.float64(0.17966666666666667) - val-aux/aime2025/score/best@8/std:np.float64(0.06592044034106512) - val-aux/aime2025/score/worst@8/mean:np.float64(0.0001) - val-aux/aime2025/score/worst@8/std:np.float64(0.0025427859021978525) - val-aux/aime2025/score/maj@8/mean:np.float64(0.07436666666666666) - val-aux/aime2025/score/maj@8/std:np.float64(0.08433171907768597) - val-aux/aime2025/score/best@16/mean:np.float64(0.20646666666666666) - val-aux/aime2025/score/best@16/std:np.float64(0.043252253371853185) - val-aux/aime2025/score/worst@16/mean:np.float64(0.0) - val-aux/aime2025/score/worst@16/std:np.float64(0.0) - val-aux/aime2025/score/maj@16/mean:np.float64(0.08046666666666667) - val-aux/aime2025/score/maj@16/std:np.float64(0.07302737628988787) - val-core/aime2025/acc/mean@16:np.float64(0.06041666666666667) - val-aux/aime2025/acc/std@16:np.float64(0.0918211299175876) - val-aux/aime2025/acc/best@2/mean:np.float64(0.09736666666666667) - val-aux/aime2025/acc/best@2/std:np.float64(0.10005625158927596) - 
val-aux/aime2025/acc/worst@2/mean:np.float64(0.02163333333333333) - val-aux/aime2025/acc/worst@2/std:np.float64(0.05652056991149442) - val-aux/aime2025/acc/maj@2/mean:np.float64(0.05846666666666667) - val-aux/aime2025/acc/maj@2/std:np.float64(0.09157997492700072) - val-aux/aime2025/acc/best@4/mean:np.float64(0.14256666666666667) - val-aux/aime2025/acc/best@4/std:np.float64(0.0914411049241623) - val-aux/aime2025/acc/worst@4/mean:np.float64(0.0033666666666666662) - val-aux/aime2025/acc/worst@4/std:np.float64(0.021257988243412563) - val-aux/aime2025/acc/maj@4/mean:np.float64(0.06713333333333334) - val-aux/aime2025/acc/maj@4/std:np.float64(0.09023409187992429) - val-aux/aime2025/acc/best@8/mean:np.float64(0.17966666666666667) - val-aux/aime2025/acc/best@8/std:np.float64(0.06592044034106512) - val-aux/aime2025/acc/worst@8/mean:np.float64(0.0001) - val-aux/aime2025/acc/worst@8/std:np.float64(0.0025427859021978525) - val-aux/aime2025/acc/maj@8/mean:np.float64(0.07436666666666666) - val-aux/aime2025/acc/maj@8/std:np.float64(0.08433171907768597) - val-core/aime2025/acc/best@16/mean:np.float64(0.20646666666666666) - val-core/aime2025/acc/best@16/std:np.float64(0.043252253371853185) - val-aux/aime2025/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2025/acc/worst@16/std:np.float64(0.0) - val-core/aime2025/acc/maj@16/mean:np.float64(0.08046666666666667) - val-core/aime2025/acc/maj@16/std:np.float64(0.07302737628988787) - val-aux/math500/reward/mean@4:np.float64(0.7035) - val-aux/math500/reward/std@4:np.float64(0.1304685657822283) - val-aux/math500/reward/best@2/mean:np.float64(0.762022) - val-aux/math500/reward/best@2/std:np.float64(0.10476828312470628) - val-aux/math500/reward/worst@2/mean:np.float64(0.644648) - val-aux/math500/reward/worst@2/std:np.float64(0.12001357411162539) - val-aux/math500/reward/maj@2/mean:np.float64(0.7034580000000001) - val-aux/math500/reward/maj@2/std:np.float64(0.1306690433222314) - val-aux/math500/reward/best@4/mean:np.float64(0.80322) - 
val-aux/math500/reward/best@4/std:np.float64(0.06148540425604767) - val-aux/math500/reward/worst@4/mean:np.float64(0.59287) - val-aux/math500/reward/worst@4/std:np.float64(0.08751340725605453) - val-aux/math500/reward/maj@4/mean:np.float64(0.7180019999999999) - val-aux/math500/reward/maj@4/std:np.float64(0.11971755357897815) - val-aux/math500/score/mean@4:np.float64(0.7035) - val-aux/math500/score/std@4:np.float64(0.1304685657822283) - val-aux/math500/score/best@2/mean:np.float64(0.762022) - val-aux/math500/score/best@2/std:np.float64(0.10476828312470628) - val-aux/math500/score/worst@2/mean:np.float64(0.644648) - val-aux/math500/score/worst@2/std:np.float64(0.12001357411162539) - val-aux/math500/score/maj@2/mean:np.float64(0.7034580000000001) - val-aux/math500/score/maj@2/std:np.float64(0.1306690433222314) - val-aux/math500/score/best@4/mean:np.float64(0.80322) - val-aux/math500/score/best@4/std:np.float64(0.06148540425604767) - val-aux/math500/score/worst@4/mean:np.float64(0.59287) - val-aux/math500/score/worst@4/std:np.float64(0.08751340725605453) - val-aux/math500/score/maj@4/mean:np.float64(0.7180019999999999) - val-aux/math500/score/maj@4/std:np.float64(0.11971755357897815) - val-core/math500/acc/mean@4:np.float64(0.7035) - val-aux/math500/acc/std@4:np.float64(0.1304685657822283) - val-aux/math500/acc/best@2/mean:np.float64(0.762022) - val-aux/math500/acc/best@2/std:np.float64(0.10476828312470628) - val-aux/math500/acc/worst@2/mean:np.float64(0.644648) - val-aux/math500/acc/worst@2/std:np.float64(0.12001357411162539) - val-aux/math500/acc/maj@2/mean:np.float64(0.7034580000000001) - val-aux/math500/acc/maj@2/std:np.float64(0.1306690433222314) - val-core/math500/acc/best@4/mean:np.float64(0.80322) - val-core/math500/acc/best@4/std:np.float64(0.06148540425604767) - val-aux/math500/acc/worst@4/mean:np.float64(0.59287) - val-aux/math500/acc/worst@4/std:np.float64(0.08751340725605453) - val-core/math500/acc/maj@4/mean:np.float64(0.7180019999999999) - 
val-core/math500/acc/maj@4/std:np.float64(0.11971755357897815) - val-aux/num_turns/min:np.int32(2) - val-aux/num_turns/max:np.int32(2) - val-aux/num_turns/mean:np.float64(2.0) - val-aux/response_length/clip_ratio:0.12094594594594595 - val-aux/aime2024/response_length/clip_ratio:0.27708333333333335 - val-aux/aime2025/response_length/clip_ratio:0.20208333333333334 - val-aux/math500/response_length/clip_ratio:0.064 - training/global_step:325 - training/epoch:0 - critic/score/mean:0.6820651888847351 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6877079606056213 - critic/rewards/max:1.5239214897155762 - critic/rewards/min:-0.09438826143741608 - critic/advantages/mean:-0.14813418686389923 - critic/advantages/max:2.4692065715789795 - critic/advantages/min:-2.4748013019561768 - critic/returns/mean:-0.14813418686389923 - critic/returns/max:2.4692065715789795 - critic/returns/min:-2.4748013019561768 - response_length/mean:1395.1507568359375 - response_length/max:8192.0 - response_length/min:162.0 - response_length/clip_ratio:0.04211956635117531 - response_length_non_aborted/mean:1395.1507568359375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:162.0 - response_length_non_aborted/clip_ratio:0.04211956635117531 - response/aborted_ratio:0.0 - prompt_length/mean:231.20652770996094 - prompt_length/max:329.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.532125502824783e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2944741174578667) - timing_s/agent_loop/generate_sequences/max:np.float64(39.13159018475562) - timing_s/agent_loop/generate_sequences/mean:np.float64(9.110740901656754) - 
timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(39.13159018475562) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:199 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:41.1847297437489 - timing_s/reward:0.00021287612617015839 - timing_s/old_log_prob:14.250147779472172 - timing_s/ref:33.42185017000884 - timing_s/adv:0.0710867065936327 - timing_s/update_actor:24.136644404381514 - timing_s/save_checkpoint:52.24955883435905 - timing_s/update_weights:40.69086611364037 - timing_s/step:206.45854523219168 - timing_s/testing:170.80525290593505 - timing_s/stop_profile:0.0004024580121040344 - timing_per_token_ms/adv:5.938744025152293e-05 - timing_per_token_ms/update_actor:0.020164297885279365 - timing_per_token_ms/gen:0.040108576526954196 - timing_per_token_ms/ref:0.02792136849739126 - perf/total_num_tokens:2062670 - perf/time_per_step:206.45854523219168 - perf/throughput:2497.680584836338 - frontier/active_count:55.0 - frontier/completed_count:9.0 - frontier/blacklisted_count:927.0 - frontier/mean_score:2.702225155173705 - frontier/mean_frontier_pct:0.38837650780523253 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:2.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:4.340411 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:80.0 - frontier/cluster_1/score:2.4811630750999996 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.223708372999999 - 
frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:80.0 - frontier/cluster_5/score:2.8709104750999987 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:1.7535856999999995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:3.0108950703539294 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:144.0 - frontier/cluster_8/score:1.4651129969509096 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:80.0 - frontier/cluster_9/score:2.9717524750999993 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.07579 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.7598999999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:2.51 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.3629999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:3.1763509999999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.0881509999999994 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:1.7582798950999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:96.0 - frontier/cluster_18/score:3.490160098798999 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.1763509999999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 
- frontier/cluster_20/score:2.9596463929999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:48.0 - frontier/cluster_21/score:2.8168036999999995 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:2.3020699999999996 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.8709104750999987 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.6601 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:80.0 - frontier/cluster_26/score:1.2970036999999999 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.7189413899999995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.827692475099999 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:2.9475403108999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.736551989999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.676550999999999 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:3.3623519899999996 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.6569999999999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.809892506569999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:2.9423519899999997 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.2212562999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:2.1991738615099994 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.7598999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:3.3069413899999995 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.5124909999999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9917645699999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:3.050816132569999 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:3.379646393 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.9596463929999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.3825509999999994 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:3.720790681099999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.037448999999999 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.462350999999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.7535856999999995 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.815586392999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.0864119899999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:80.0 - frontier/cluster_55/score:3.3216124750999994 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4119299999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.3176456999999995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.1775524750999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:325.0 - cluster/prob_snapshot/cluster_0:0.029204288726744056 - cluster/prob_snapshot/cluster_1:0.01669441046558875 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.021690597986207826 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.019316810878844036 - cluster/prob_snapshot/cluster_6:0.011798934038709599 - cluster/prob_snapshot/cluster_7:0.020258691852118732 - cluster/prob_snapshot/cluster_8:0.009857956534590767 - cluster/prob_snapshot/cluster_9:0.019995322403163353 - cluster/prob_snapshot/cluster_10:0.027423796490603343 - cluster/prob_snapshot/cluster_11:0.018569881160318897 - 
cluster/prob_snapshot/cluster_12:0.016888438607341006 - cluster/prob_snapshot/cluster_13:0.015899354752648122 - cluster/prob_snapshot/cluster_14:0.021371955720663825 - cluster/prob_snapshot/cluster_15:0.020778505407848098 - cluster/prob_snapshot/cluster_16:0.011830518750166663 - cluster/prob_snapshot/cluster_17:0.012530757182796861 - cluster/prob_snapshot/cluster_18:0.023483408190580923 - cluster/prob_snapshot/cluster_19:0.021371955720663825 - cluster/prob_snapshot/cluster_20:0.019913867094668823 - cluster/prob_snapshot/cluster_21:0.018952755520470512 - cluster/prob_snapshot/cluster_22:0.01548938958757032 - cluster/prob_snapshot/cluster_23:0.019316810878844036 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.011169919096432989 - cluster/prob_snapshot/cluster_26:0.008726839586033517 - cluster/prob_snapshot/cluster_27:0.01829429272588582 - cluster/prob_snapshot/cluster_28:0.019026020225564332 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01983241178617429 - cluster/prob_snapshot/cluster_31:0.018412784971678024 - cluster/prob_snapshot/cluster_32:0.018009070614708033 - cluster/prob_snapshot/cluster_33:0.022623456238799144 - cluster/prob_snapshot/cluster_34:0.017877522462033883 - cluster/prob_snapshot/cluster_35:0.025634715417440623 - cluster/prob_snapshot/cluster_36:0.019797502368248062 - cluster/prob_snapshot/cluster_37:0.01494563771064519 - cluster/prob_snapshot/cluster_38:0.014797056871307042 - cluster/prob_snapshot/cluster_39:0.018569881160318897 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02225062814465734 - cluster/prob_snapshot/cluster_42:0.01017673760862781 - cluster/prob_snapshot/cluster_43:0.020129973015164525 - cluster/prob_snapshot/cluster_44:0.02052729918613305 - cluster/prob_snapshot/cluster_45:0.01406248473678992 - cluster/prob_snapshot/cluster_46:0.02273982096521991 - cluster/prob_snapshot/cluster_47:0.019913867094668823 - 
cluster/prob_snapshot/cluster_48:0.016030902905322275 - cluster/prob_snapshot/cluster_49:0.02503519720658321 - cluster/prob_snapshot/cluster_50:0.013708897351429605 - cluster/prob_snapshot/cluster_51:0.01656783414072698 - cluster/prob_snapshot/cluster_52:0.011798934038709599 - cluster/prob_snapshot/cluster_53:0.018944564917069796 - cluster/prob_snapshot/cluster_54:0.02076680454584708 - cluster/prob_snapshot/cluster_55:0.022349341977332408 - cluster/prob_snapshot/cluster_56:0.016228578378567324 - cluster/prob_snapshot/cluster_57:0.015594190086859706 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.021380039799733387 - cluster/prob_snapshot/cluster_60:0.011438384714135342 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826757) WARNING:2026-04-12 22:37:07,669:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  41%|████      | 326/800 [11:07:09<25:55:28, 196.90s/it]
[36m(TaskRunner pid=2823680)[0m step:326 - global_seqlen/min:372720 - global_seqlen/max:561330 - global_seqlen/minmax_diff:188610 - global_seqlen/balanced_min:475785 - global_seqlen/balanced_max:475925 - global_seqlen/mean:475854.25 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.1766701259083398) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.015736600384116173 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.09303108534368221) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0009199945322072143) - actor/ppo_kl:np.float64(6.177660217981216e-05) - actor/pg_clipfrac_lower:np.float64(1.557048232786221e-05) - actor/grad_norm:np.float64(0.598473347723484) - perf/mfu/actor:np.float64(0.24912653979804344) - perf/max_memory_allocated_gb:np.float64(97.67490196228027) - perf/max_memory_reserved_gb:np.float64(104.267578125) - perf/cpu_memory_used_gb:np.float64(104.89426040649414) - actor/lr:np.float64(1e-06) - training/global_step:326 - training/epoch:0 - critic/score/mean:0.6208791136741638 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6322202682495117 - critic/rewards/max:1.5523359775543213 - critic/rewards/min:-0.13227537274360657 - critic/advantages/mean:-0.11133905500173569 - critic/advantages/max:2.4747931957244873 - critic/advantages/min:-2.474817991256714 - critic/returns/mean:-0.11133905500173569 - critic/returns/max:2.4747931957244873 - critic/returns/min:-2.474817991256714 - response_length/mean:1370.7774658203125 - response_length/max:8192.0 - response_length/min:72.0 - response_length/clip_ratio:0.03846153989434242 - response_length_non_aborted/mean:1370.7774658203125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:72.0 - response_length_non_aborted/clip_ratio:0.03846153989434242 - response/aborted_ratio:0.0 - prompt_length/mean:247.61538696289062 - prompt_length/max:694.0 - prompt_length/min:162.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.4717757999897e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.6568051045760512) - timing_s/agent_loop/generate_sequences/max:np.float64(35.052197243086994) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.266955981883257) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(35.052197243086994) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:190 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:37.60352761950344 - timing_s/reward:0.00018066540360450745 - timing_s/old_log_prob:13.513425865210593 - timing_s/ref:30.26199369877577 - timing_s/adv:0.06425898615270853 - timing_s/update_actor:22.982760903425515 - timing_s/update_weights:40.520513258874416 - timing_s/step:145.34170182608068 - timing_s/stop_profile:4.854891449213028e-05 - timing_per_token_ms/adv:5.454042739516422e-05 - timing_per_token_ms/update_actor:0.01950683752486909 - timing_per_token_ms/gen:0.037681679422625965 - timing_per_token_ms/ref:0.02568515578877411 - perf/total_num_tokens:1903417 - perf/time_per_step:145.34170182608068 - perf/throughput:3274.0379672271793 - frontier/active_count:54.0 - frontier/completed_count:10.0 - frontier/blacklisted_count:964.0 - frontier/mean_score:2.7237033707692992 - frontier/mean_frontier_pct:0.3968071045066695 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:4.340411 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.6368141525699995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.223708372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:2.309637332569999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:80.0 - frontier/cluster_6/score:2.127509989999999 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:3.0108950703539294 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:2.3802267325699993 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:64.0 - frontier/cluster_10/score:4.07579 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:32.0 - frontier/cluster_11/score:2.8319299999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:2.51 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.3629999999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:3.1763509999999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.0881509999999994 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:1.7582798950999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:1.8623509999999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:3.3431120691592993 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.1763509999999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.9596463929999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:64.0 - frontier/cluster_21/score:2.8717625899999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:2.3020699999999996 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.8709104750999987 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.6601 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:80.0 - frontier/cluster_26/score:1.2970036999999999 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:64.0 - frontier/cluster_27/score:2.8032589729999993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.8793847325699993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:2.9475403108999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:64.0 - frontier/cluster_31/score:2.736551989999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:48.0 - frontier/cluster_32/score:2.676550999999999 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:3.3623519899999996 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.6569999999999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.809892506569999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:64.0 - frontier/cluster_36/score:2.9423519899999997 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.2212562999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:112.0 - frontier/cluster_38/score:2.1991738615099994 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.8319299999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:64.0 - frontier/cluster_41/score:3.2148589729999992 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:64.0 - frontier/cluster_42/score:1.5124909999999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9917645699999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:3.0355712927989993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:3.379646393 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.9596463929999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.3825509999999994 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - 
frontier/cluster_49/score:3.720790681099999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.3262142999999993 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.462350999999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.7535856999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.815586392999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.0864119899999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.225128732569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4119299999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.3176456999999995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.1775524750999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:326.0 - cluster/prob_snapshot/cluster_0:0.029510548888728304 - cluster/prob_snapshot/cluster_1:0.017927756832223415 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.021918063414828517 - 
cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.015703274509727073 - cluster/prob_snapshot/cluster_6:0.014464986742304552 - cluster/prob_snapshot/cluster_7:0.020471141136751955 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.01618321337720437 - cluster/prob_snapshot/cluster_10:0.0277113849483816 - cluster/prob_snapshot/cluster_11:0.019254353726975704 - cluster/prob_snapshot/cluster_12:0.017065544647893494 - cluster/prob_snapshot/cluster_13:0.01606608844739933 - cluster/prob_snapshot/cluster_14:0.02159607960473352 - cluster/prob_snapshot/cluster_15:0.020996405884437026 - cluster/prob_snapshot/cluster_16:0.011954583288176307 - cluster/prob_snapshot/cluster_17:0.012662164996234695 - cluster/prob_snapshot/cluster_18:0.022729891744681124 - cluster/prob_snapshot/cluster_19:0.02159607960473352 - cluster/prob_snapshot/cluster_20:0.020122700263632838 - cluster/prob_snapshot/cluster_21:0.019525176373623606 - cluster/prob_snapshot/cluster_22:0.01565182405082716 - cluster/prob_snapshot/cluster_23:0.019519382825866197 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.011287056043811947 - cluster/prob_snapshot/cluster_26:0.008818356394754206 - cluster/prob_snapshot/cluster_27:0.01905941879017512 - cluster/prob_snapshot/cluster_28:0.01957699948690685 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.020040390747860484 - cluster/prob_snapshot/cluster_31:0.018605876560409074 - cluster/prob_snapshot/cluster_32:0.01819792852305337 - cluster/prob_snapshot/cluster_33:0.022860704385369933 - cluster/prob_snapshot/cluster_34:0.018065000848387653 - cluster/prob_snapshot/cluster_35:0.02590354210141242 - cluster/prob_snapshot/cluster_36:0.020005115241100904 - cluster/prob_snapshot/cluster_37:0.015102369945045695 - cluster/prob_snapshot/cluster_38:0.014952230964971808 - cluster/prob_snapshot/cluster_39:0.019254353726975704 - cluster/prob_snapshot/cluster_40:0.0 - 
cluster/prob_snapshot/cluster_41:0.02185789615155877 - cluster/prob_snapshot/cluster_42:0.010283459239058596 - cluster/prob_snapshot/cluster_43:0.02034107244833501 - cluster/prob_snapshot/cluster_44:0.020638915310408405 - cluster/prob_snapshot/cluster_45:0.01606608844739933 - cluster/prob_snapshot/cluster_46:0.02297828940790187 - cluster/prob_snapshot/cluster_47:0.020122700263632838 - cluster/prob_snapshot/cluster_48:0.016199016122065053 - cluster/prob_snapshot/cluster_49:0.02529773685011079 - cluster/prob_snapshot/cluster_50:0.015815981672198526 - cluster/prob_snapshot/cluster_51:0.016741578059476167 - cluster/prob_snapshot/cluster_52:0.011922667353489068 - cluster/prob_snapshot/cluster_53:0.019143233187148556 - cluster/prob_snapshot/cluster_54:0.02098458231758518 - cluster/prob_snapshot/cluster_55:0.021927720470469113 - cluster/prob_snapshot/cluster_56:0.01639876458270667 - cluster/prob_snapshot/cluster_57:0.015757723574242374 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.02160424846009702 - cluster/prob_snapshot/cluster_60:0.011558337012517506 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826760) WARNING:2026-04-12 22:39:41,616:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  41%|████      | 327/800 [11:09:40<24:02:48, 183.02s/it]
[36m(TaskRunner pid=2823680)[0m step:327 - global_seqlen/min:423125 - global_seqlen/max:486158 - global_seqlen/minmax_diff:63033 - global_seqlen/balanced_min:457608 - global_seqlen/balanced_max:457690 - global_seqlen/mean:457640.75 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.1866646207585607) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010123373940587044 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.0022347951135088806) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0021646219579649724) - actor/ppo_kl:np.float64(-0.0017418796165489055) - actor/pg_clipfrac_lower:np.float64(0.0003338221427127238) - actor/grad_norm:np.float64(0.3870826984445254) - perf/mfu/actor:np.float64(0.22122469593462857) - perf/max_memory_allocated_gb:np.float64(97.67490196228027) - perf/max_memory_reserved_gb:np.float64(104.267578125) - perf/cpu_memory_used_gb:np.float64(104.95472049713135) - actor/lr:np.float64(1e-06) - training/global_step:327 - training/epoch:0 - critic/score/mean:0.5920329689979553 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6129973530769348 - critic/rewards/max:1.2391555309295654 - critic/rewards/min:-0.08620475232601166 - critic/advantages/mean:-0.11805014312267303 - critic/advantages/max:2.4747774600982666 - critic/advantages/min:-2.4745967388153076 - critic/returns/mean:-0.11805014312267303 - critic/returns/max:2.4747774600982666 - critic/returns/min:-2.4745967388153076 - response_length/mean:1459.232177734375 - response_length/max:8192.0 - response_length/min:124.0 - response_length/clip_ratio:0.057692307978868484 - response_length_non_aborted/mean:1459.232177734375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:124.0 - response_length_non_aborted/clip_ratio:0.057692307978868484 - response/aborted_ratio:0.0 - prompt_length/mean:241.6813201904297 - prompt_length/max:406.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.475314825773239e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.054306455887854) - timing_s/agent_loop/generate_sequences/max:np.float64(34.370110562071204) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.862506856397886) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(34.370110562071204) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:195 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:36.65653069317341 - timing_s/reward:0.0001214081421494484 - timing_s/old_log_prob:13.64989553578198 - timing_s/ref:32.92854733392596 - timing_s/adv:0.06932049337774515 - timing_s/update_actor:24.949069901369512 - timing_s/update_weights:41.7938840938732 - timing_s/step:150.42792553827167 - timing_s/stop_profile:5.0527043640613556e-05 - timing_per_token_ms/adv:5.5981953279584866e-05 - timing_per_token_ms/update_actor:0.020148409186538836 - timing_per_token_ms/gen:0.03450607744097444 - timing_per_token_ms/ref:0.02659248814585405 - perf/total_num_tokens:1830563 - perf/time_per_step:150.42792553827167 - perf/throughput:3042.2592637799003 - frontier/active_count:54.0 - frontier/completed_count:10.0 - frontier/blacklisted_count:1001.0 - frontier/mean_score:2.6978070704992434 - frontier/mean_frontier_pct:0.41767733035440685 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.9382876999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.6368141525699995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.223708372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:2.309637332569999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.7892569929999993 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:112.0 - frontier/cluster_7/score:3.0108950703539294 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:96.0 - frontier/cluster_9/score:2.3802267325699993 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.353052999999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.8823509999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:2.51 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:1.9540999999999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:48.0 - frontier/cluster_14/score:3.1763509999999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.0881509999999994 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - 
frontier/cluster_16/score:1.7582798950999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:64.0 - frontier/cluster_17/score:2.203645699999999 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:3.3431120691592993 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.1763509999999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.9596463929999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.3102338129999995 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:2.3020699999999996 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:80.0 - frontier/cluster_23/score:2.8709104750999987 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.6601 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:80.0 - frontier/cluster_26/score:1.2970036999999999 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.8622812810999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:80.0 - frontier/cluster_28/score:2.8793847325699993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:80.0 - frontier/cluster_30/score:2.9475403108999996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.815586392999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.773585699999999 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:3.3623519899999996 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.6569999999999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:112.0 - frontier/cluster_35/score:3.809892506569999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:2.3596463929999993 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.2212562999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:128.0 - frontier/cluster_38/score:2.4394217030569996 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:32.0 - frontier/cluster_39/score:2.8319299999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.1504012810999993 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:1.3587436999999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:64.0 - frontier/cluster_43/score:2.9917645699999995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:96.0 - frontier/cluster_44/score:3.0355712927989993 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:64.0 - frontier/cluster_46/score:3.379646393 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.9596463929999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.5677856999999995 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:3.504553476769999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.3262142999999993 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.462350999999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.7535856999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.815586392999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.0864119899999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.225128732569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4119299999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.3176456999999995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:80.0 - frontier/cluster_59/score:3.1775524750999993 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:327.0 - cluster/prob_snapshot/cluster_0:0.027033531975363748 - cluster/prob_snapshot/cluster_1:0.018099845703652567 - cluster/prob_snapshot/cluster_2:0.0 
- cluster/prob_snapshot/cluster_3:0.022128455313395037 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.015854010533949796 - cluster/prob_snapshot/cluster_6:0.01228197120601643 - cluster/prob_snapshot/cluster_7:0.020667644001447136 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.016338556343546626 - cluster/prob_snapshot/cluster_10:0.029880599496566254 - cluster/prob_snapshot/cluster_11:0.01978528077639469 - cluster/prob_snapshot/cluster_12:0.017229357128521364 - cluster/prob_snapshot/cluster_13:0.013413500703124939 - cluster/prob_snapshot/cluster_14:0.02180338077471552 - cluster/prob_snapshot/cluster_15:0.02119795077521927 - cluster/prob_snapshot/cluster_16:0.012069335555608355 - cluster/prob_snapshot/cluster_17:0.015126453685271091 - cluster/prob_snapshot/cluster_18:0.022948076398492265 - cluster/prob_snapshot/cluster_19:0.02180338077471552 - cluster/prob_snapshot/cluster_20:0.020315858437903223 - cluster/prob_snapshot/cluster_21:0.015858104946040893 - cluster/prob_snapshot/cluster_22:0.01580206620113752 - cluster/prob_snapshot/cluster_23:0.01970674974482503 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.011395400704804111 - cluster/prob_snapshot/cluster_26:0.008903003961877922 - cluster/prob_snapshot/cluster_27:0.01964751649177432 - cluster/prob_snapshot/cluster_28:0.01976491946926713 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.020232758831557365 - cluster/prob_snapshot/cluster_31:0.019326989438726013 - cluster/prob_snapshot/cluster_32:0.0190386846820159 - cluster/prob_snapshot/cluster_33:0.023080144712153184 - cluster/prob_snapshot/cluster_34:0.018238407127681776 - cluster/prob_snapshot/cluster_35:0.026152190684052567 - cluster/prob_snapshot/cluster_36:0.01619728701275867 - cluster/prob_snapshot/cluster_37:0.01524733787517051 - cluster/prob_snapshot/cluster_38:0.016744887533480017 - cluster/prob_snapshot/cluster_39:0.01943917662668267 - 
cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02162525449012873 - cluster/prob_snapshot/cluster_42:0.009326804961525296 - cluster/prob_snapshot/cluster_43:0.020536326781269782 - cluster/prob_snapshot/cluster_44:0.020837028642518428 - cluster/prob_snapshot/cluster_45:0.01622030712936095 - cluster/prob_snapshot/cluster_46:0.02319885843550441 - cluster/prob_snapshot/cluster_47:0.020315858437903223 - cluster/prob_snapshot/cluster_48:0.017626014683191322 - cluster/prob_snapshot/cluster_49:0.0240562563455265 - cluster/prob_snapshot/cluster_50:0.015967799574571045 - cluster/prob_snapshot/cluster_51:0.016902280778793504 - cluster/prob_snapshot/cluster_52:0.012037113259270166 - cluster/prob_snapshot/cluster_53:0.019326989438726013 - cluster/prob_snapshot/cluster_54:0.021186013713729204 - cluster/prob_snapshot/cluster_55:0.022138205067292426 - cluster/prob_snapshot/cluster_56:0.016556176629081484 - cluster/prob_snapshot/cluster_57:0.01590898225604856 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.02181162804307366 - cluster/prob_snapshot/cluster_60:0.011669285704576223 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  41%|████      | 328/800 [11:12:15<22:54:00, 174.66s/it]
[36m(TaskRunner pid=2823680)[0m step:328 - global_seqlen/min:351950 - global_seqlen/max:614739 - global_seqlen/minmax_diff:262789 - global_seqlen/balanced_min:498087 - global_seqlen/balanced_max:498203 - global_seqlen/mean:498161.0 - frontier/skipped_zero_acc_count:29.0 - actor/entropy:np.float64(0.1787721736729145) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009787666611373425 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.004742842992527585) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.002081869240064407) - actor/ppo_kl:np.float64(0.0029103423907281467) - actor/pg_clipfrac_lower:np.float64(0.00012344407979981043) - actor/grad_norm:np.float64(0.40750706654328567) - perf/mfu/actor:np.float64(0.24966893848169353) - perf/max_memory_allocated_gb:np.float64(97.67490196228027) - perf/max_memory_reserved_gb:np.float64(104.267578125) - perf/cpu_memory_used_gb:np.float64(104.89312744140625) - actor/lr:np.float64(1e-06) - training/global_step:328 - training/epoch:0 - critic/score/mean:0.6174242496490479 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6426575183868408 - critic/rewards/max:1.6919602155685425 - critic/rewards/min:-0.2785242199897766 - critic/advantages/mean:-0.10473469644784927 - critic/advantages/max:2.4731345176696777 - critic/advantages/min:-2.474735975265503 - critic/returns/mean:-0.10473469644784927 - critic/returns/max:2.4731345176696777 - critic/returns/min:-2.474735975265503 - response_length/mean:1495.0391845703125 - response_length/max:8192.0 - response_length/min:135.0 - response_length/clip_ratio:0.06691919267177582 - response_length_non_aborted/mean:1495.0391845703125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:135.0 - response_length_non_aborted/clip_ratio:0.06691919267177582 - response/aborted_ratio:0.0 - prompt_length/mean:245.7979736328125 - prompt_length/max:728.0 - prompt_length/min:183.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.332410991191864e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5676941014826298) - timing_s/agent_loop/generate_sequences/max:np.float64(38.34973802417517) - timing_s/agent_loop/generate_sequences/mean:np.float64(9.273186732590148) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(38.34973802417517) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:220 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:40.756796452216804 - timing_s/reward:0.0002454286441206932 - timing_s/old_log_prob:14.381189071573317 - timing_s/ref:32.75799554400146 - timing_s/adv:0.08193792216479778 - timing_s/update_actor:24.271731070242822 - timing_s/update_weights:41.75459066964686 - timing_s/step:154.3997626239434 - timing_s/stop_profile:6.55241310596466e-05 - timing_per_token_ms/adv:5.942943838322137e-05 - timing_per_token_ms/update_actor:0.017604246092450024 - timing_per_token_ms/gen:0.0344209058850498 - timing_per_token_ms/ref:0.023759319571523815 - perf/total_num_tokens:1992644 - perf/time_per_step:154.3997626239434 - perf/throughput:3226.436307504712 - frontier/active_count:54.0 - frontier/completed_count:10.0 - frontier/blacklisted_count:1030.0 - frontier/mean_score:2.709826021992693 - frontier/mean_frontier_pct:0.4384203009425715 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:64.0 - frontier/cluster_0/score:3.9382876999999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.6368141525699995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.223708372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:2.309637332569999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.7892569929999993 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:3.0076265492477505 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.9661587127989995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.353052999999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.8823509999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:2.51 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:1.9540999999999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.5234456999999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:48.0 - frontier/cluster_15/score:3.0881509999999994 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - 
frontier/cluster_16/score:1.7582798950999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:2.4425519899999992 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:3.240178448411509 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.1763509999999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.9596463929999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.3102338129999995 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:64.0 - frontier/cluster_22/score:2.3020699999999996 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:96.0 - frontier/cluster_23/score:2.909637332569999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.6601 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:80.0 - frontier/cluster_26/score:1.2970036999999999 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:80.0 - frontier/cluster_27/score:2.8622812810999996 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:3.5155693127989993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:96.0 - frontier/cluster_30/score:2.9632782176299997 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.815586392999999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.773585699999999 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:3.3623519899999996 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.6569999999999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:4.166924754598999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:2.3596463929999993 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:64.0 - frontier/cluster_37/score:2.2212562999999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:128.0 - frontier/cluster_38/score:2.6075951921398994 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.8823509999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.1504012810999993 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:80.0 - frontier/cluster_42/score:1.8511205899999998 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.9942351989999993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.424899904959299 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:3.2657524750999998 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.9596463929999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.5677856999999995 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:96.0 - frontier/cluster_49/score:3.504553476769999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.3262142999999993 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.462350999999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.7535856999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.815586392999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.0864119899999993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.225128732569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4119299999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.3176456999999995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.7242867325699995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:328.0 - cluster/prob_snapshot/cluster_0:0.026913629551048854 - cluster/prob_snapshot/cluster_1:0.018019567056320386 - cluster/prob_snapshot/cluster_2:0.0 
- cluster/prob_snapshot/cluster_3:0.022030308484455415 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.015783692889186733 - cluster/prob_snapshot/cluster_6:0.01222749670655742 - cluster/prob_snapshot/cluster_7:0.020553639789788174 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.013436414824351033 - cluster/prob_snapshot/cluster_10:0.02974806941049072 - cluster/prob_snapshot/cluster_11:0.019697526681480178 - cluster/prob_snapshot/cluster_12:0.017152939378484874 - cluster/prob_snapshot/cluster_13:0.013354007505775813 - cluster/prob_snapshot/cluster_14:0.01724482514621447 - cluster/prob_snapshot/cluster_15:0.02110393103370814 - cluster/prob_snapshot/cluster_16:0.01201580416376854 - cluster/prob_snapshot/cluster_17:0.0166920104435329 - cluster/prob_snapshot/cluster_18:0.022142862351026214 - cluster/prob_snapshot/cluster_19:0.021706675756091556 - cluster/prob_snapshot/cluster_20:0.020225751060111718 - cluster/prob_snapshot/cluster_21:0.01578776914124102 - cluster/prob_snapshot/cluster_22:0.015731978946226563 - cluster/prob_snapshot/cluster_23:0.019883997123087502 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.011344858431164439 - cluster/prob_snapshot/cluster_26:0.008863516270824932 - cluster/prob_snapshot/cluster_27:0.019560373425848734 - cluster/prob_snapshot/cluster_28:0.02402483956306891 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02025057036995701 - cluster/prob_snapshot/cluster_31:0.01924126801355214 - cluster/prob_snapshot/cluster_32:0.018954241981327696 - cluster/prob_snapshot/cluster_33:0.02297777683410278 - cluster/prob_snapshot/cluster_34:0.018157513915790558 - cluster/prob_snapshot/cluster_35:0.02847609868938032 - cluster/prob_snapshot/cluster_36:0.016125446826210952 - cluster/prob_snapshot/cluster_37:0.015179711019114583 - cluster/prob_snapshot/cluster_38:0.01781988934438419 - cluster/prob_snapshot/cluster_39:0.019697526681480178 - 
cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.021529339518967877 - cluster/prob_snapshot/cluster_42:0.012650262654396475 - cluster/prob_snapshot/cluster_43:0.02046212543959067 - cluster/prob_snapshot/cluster_44:0.016571378911816967 - cluster/prob_snapshot/cluster_45:0.016148364841179184 - cluster/prob_snapshot/cluster_46:0.02231763116754073 - cluster/prob_snapshot/cluster_47:0.020225751060111718 - cluster/prob_snapshot/cluster_48:0.017547837629099738 - cluster/prob_snapshot/cluster_49:0.023949559097886132 - cluster/prob_snapshot/cluster_50:0.01589697723875084 - cluster/prob_snapshot/cluster_51:0.016827313717749637 - cluster/prob_snapshot/cluster_52:0.011983724783696397 - cluster/prob_snapshot/cluster_53:0.01924126801355214 - cluster/prob_snapshot/cluster_54:0.021092046916931815 - cluster/prob_snapshot/cluster_55:0.02204001499505297 - cluster/prob_snapshot/cluster_56:0.01648274465145379 - cluster/prob_snapshot/cluster_57:0.015838420793986507 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.025451181096362036 - cluster/prob_snapshot/cluster_60:0.011617528662718838 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  41%|████      | 329/800 [11:14:47<21:58:37, 167.98s/it]
[36m(TaskRunner pid=2823680)[0m step:329 - global_seqlen/min:424889 - global_seqlen/max:611362 - global_seqlen/minmax_diff:186473 - global_seqlen/balanced_min:533185 - global_seqlen/balanced_max:533304 - global_seqlen/mean:533252.75 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.17354688507815202) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008051756769418716 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.02778944444435183) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0027400253448932846) - actor/ppo_kl:np.float64(-0.00028173768287160784) - actor/pg_clipfrac_lower:np.float64(0.0005265109968200301) - actor/grad_norm:np.float64(0.2887571131189664) - perf/mfu/actor:np.float64(0.29780184546105687) - perf/max_memory_allocated_gb:np.float64(97.67490196228027) - perf/max_memory_reserved_gb:np.float64(104.267578125) - perf/cpu_memory_used_gb:np.float64(105.1535873413086) - actor/lr:np.float64(1e-06) - training/global_step:329 - training/epoch:0 - critic/score/mean:0.612500011920929 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6385146379470825 - critic/rewards/max:1.7374839782714844 - critic/rewards/min:-0.0908379778265953 - critic/advantages/mean:-0.025784488767385483 - critic/advantages/max:2.474613666534424 - critic/advantages/min:-2.4748411178588867 - critic/returns/mean:-0.025784488767385483 - critic/returns/max:2.474613666534424 - critic/returns/min:-2.4748411178588867 - response_length/mean:1487.4403076171875 - response_length/max:8192.0 - response_length/min:173.0 - response_length/clip_ratio:0.05000000074505806 - response_length_non_aborted/mean:1487.4403076171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:173.0 - response_length_non_aborted/clip_ratio:0.05000000074505806 - response/aborted_ratio:0.0 - prompt_length/mean:239.8000030517578 - prompt_length/max:728.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.121823728084564e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.377494715154171) - timing_s/agent_loop/generate_sequences/max:np.float64(40.64913451205939) - timing_s/agent_loop/generate_sequences/mean:np.float64(9.6347241741978) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(40.64913451205939) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:185 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:42.369033347815275 - timing_s/reward:0.00014293286949396133 - timing_s/old_log_prob:12.918734698556364 - timing_s/ref:32.63769512530416 - timing_s/adv:0.06221220176666975 - timing_s/update_actor:21.958441586233675 - timing_s/update_weights:41.79078047256917 - timing_s/step:152.12738945242018 - timing_s/stop_profile:4.8937276005744934e-05 - timing_per_token_ms/adv:5.002537104924904e-05 - timing_per_token_ms/update_actor:0.017656973339964823 - timing_per_token_ms/gen:0.03956184361072879 - timing_per_token_ms/ref:0.026244253739148887 - perf/total_num_tokens:2133011 - perf/time_per_step:152.12738945242018 - perf/throughput:3505.30402131683 - frontier/active_count:54.0 - frontier/completed_count:10.0 - frontier/blacklisted_count:1068.0 - frontier/mean_score:2.6857926850142664 - frontier/mean_frontier_pct:0.450347575312058 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:80.0 - frontier/cluster_0/score:3.6568013899999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.6368141525699995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.223708372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:2.309637332569999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:96.0 - frontier/cluster_6/score:1.7892569929999993 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:3.0053385844734253 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.9661587127989995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:80.0 - frontier/cluster_10/score:4.353052999999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.8823509999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:2.51 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:1.9540999999999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.5234456999999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:3.061705699999999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - 
frontier/cluster_16/score:1.7582798950999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:2.4425519899999992 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:112.0 - frontier/cluster_18/score:3.240178448411509 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:48.0 - frontier/cluster_19/score:3.1234456999999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.9596463929999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.5171636690999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:80.0 - frontier/cluster_22/score:1.9114489999999997 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:96.0 - frontier/cluster_23/score:2.936746132798999 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.6601 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:80.0 - frontier/cluster_26/score:1.2970036999999999 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:2.3035968967699993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:3.3608985189592993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:112.0 - frontier/cluster_30/score:2.3742947523409996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:80.0 - frontier/cluster_31/score:2.8709104750999987 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.841509989999999 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:64.0 - frontier/cluster_33/score:3.3623519899999996 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.6569999999999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:4.166924754598999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:2.3596463929999993 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:2.4548794099999993 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:128.0 - frontier/cluster_38/score:2.6075951921398994 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.8823509999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.1504012810999993 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.1957844129999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.9942351989999993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.424899904959299 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:80.0 - frontier/cluster_46/score:3.2657524750999998 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:64.0 - frontier/cluster_47/score:2.9596463929999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.5677856999999995 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:3.353187433738999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.3262142999999993 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.462350999999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.7535856999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.815586392999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.060488392999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.225128732569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4119299999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.3176456999999995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.7242867325699995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:329.0 - cluster/prob_snapshot/cluster_0:0.025213615569475552 - cluster/prob_snapshot/cluster_1:0.018180811939325046 - cluster/prob_snapshot/cluster_2:0.0 
- cluster/prob_snapshot/cluster_3:0.022227442772034575 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.015924930450852 - cluster/prob_snapshot/cluster_6:0.012336912280734446 - cluster/prob_snapshot/cluster_7:0.020721784872496117 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.013556648186761071 - cluster/prob_snapshot/cluster_10:0.03001426470567827 - cluster/prob_snapshot/cluster_11:0.01987378648701876 - cluster/prob_snapshot/cluster_12:0.017306429398229812 - cluster/prob_snapshot/cluster_13:0.013473503460988395 - cluster/prob_snapshot/cluster_14:0.01739913738936916 - cluster/prob_snapshot/cluster_15:0.021110435671397518 - cluster/prob_snapshot/cluster_16:0.012123325444970143 - cluster/prob_snapshot/cluster_17:0.016841375922884746 - cluster/prob_snapshot/cluster_18:0.022341003806812587 - cluster/prob_snapshot/cluster_19:0.021536132464643223 - cluster/prob_snapshot/cluster_20:0.020406737587322716 - cluster/prob_snapshot/cluster_21:0.017355822837875796 - cluster/prob_snapshot/cluster_22:0.013179425166062539 - cluster/prob_snapshot/cluster_23:0.020248840481199328 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.011446375874104107 - cluster/prob_snapshot/cluster_26:0.00894282986585372 - cluster/prob_snapshot/cluster_27:0.01588328169559015 - cluster/prob_snapshot/cluster_28:0.023173367702384165 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01637074282947293 - cluster/prob_snapshot/cluster_31:0.019794904161735672 - cluster/prob_snapshot/cluster_32:0.019592188058286726 - cluster/prob_snapshot/cluster_33:0.02318338937327988 - cluster/prob_snapshot/cluster_34:0.018319993191671952 - cluster/prob_snapshot/cluster_35:0.02873091198135604 - cluster/prob_snapshot/cluster_36:0.016269742512048654 - cluster/prob_snapshot/cluster_37:0.016926373382602805 - cluster/prob_snapshot/cluster_38:0.01797934744698513 - cluster/prob_snapshot/cluster_39:0.01987378648701876 - 
cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02172199097507964 - cluster/prob_snapshot/cluster_42:0.015139915504907564 - cluster/prob_snapshot/cluster_43:0.020645227120792062 - cluster/prob_snapshot/cluster_44:0.016719664941415253 - cluster/prob_snapshot/cluster_45:0.016292865604787665 - cluster/prob_snapshot/cluster_46:0.022517336510921276 - cluster/prob_snapshot/cluster_47:0.020406737587322716 - cluster/prob_snapshot/cluster_48:0.017704861325431916 - cluster/prob_snapshot/cluster_49:0.02312019983308183 - cluster/prob_snapshot/cluster_50:0.016039228505220146 - cluster/prob_snapshot/cluster_51:0.016977889934326917 - cluster/prob_snapshot/cluster_52:0.012090959008285017 - cluster/prob_snapshot/cluster_53:0.01941344506974941 - cluster/prob_snapshot/cluster_54:0.021102042349624022 - cluster/prob_snapshot/cluster_55:0.022237236139611587 - cluster/prob_snapshot/cluster_56:0.016630237553176264 - cluster/prob_snapshot/cluster_57:0.01598014807855016 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.02567892645258434 - cluster/prob_snapshot/cluster_60:0.011721486046609832 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  41%|████▏     | 330/800 [11:17:07<20:49:52, 159.56s/it]
[36m(TaskRunner pid=2823680)[0m step:330 - global_seqlen/min:411515 - global_seqlen/max:569181 - global_seqlen/minmax_diff:157666 - global_seqlen/balanced_min:474703 - global_seqlen/balanced_max:474819 - global_seqlen/mean:474758.75 - frontier/skipped_zero_acc_count:23.0 - actor/entropy:np.float64(0.18150074576150696) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011237024329602718 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.03675808997650165) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0016295607471514607) - actor/ppo_kl:np.float64(-0.00031043104025968123) - actor/pg_clipfrac_lower:np.float64(0.00024884076901798835) - actor/grad_norm:np.float64(0.3212448443685259) - perf/mfu/actor:np.float64(0.20719043249919705) - perf/max_memory_allocated_gb:np.float64(97.67490196228027) - perf/max_memory_reserved_gb:np.float64(104.267578125) - perf/cpu_memory_used_gb:np.float64(105.1575231552124) - actor/lr:np.float64(1e-06) - training/global_step:330 - training/epoch:0 - critic/score/mean:0.648809552192688 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6660721302032471 - critic/rewards/max:1.5823915004730225 - critic/rewards/min:-0.4796715974807739 - critic/advantages/mean:-0.16129912436008453 - critic/advantages/max:2.4746594429016113 - critic/advantages/min:-2.4748446941375732 - critic/returns/mean:-0.16129912436008453 - critic/returns/max:2.4746594429016113 - critic/returns/min:-2.4748446941375732 - response_length/mean:1480.0738525390625 - response_length/max:8192.0 - response_length/min:137.0 - response_length/clip_ratio:0.07500000298023224 - response_length_non_aborted/mean:1480.0738525390625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:137.0 - response_length_non_aborted/clip_ratio:0.07500000298023224 - response/aborted_ratio:0.0 - prompt_length/mean:237.98095703125 - prompt_length/max:411.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.83219763636589e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2128904638811946) - timing_s/agent_loop/generate_sequences/max:np.float64(38.56385129131377) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.427594343004785) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(38.56385129131377) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:195 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:40.935332597233355 - timing_s/reward:0.00015258695930242538 - timing_s/old_log_prob:15.741873051971197 - timing_s/ref:22.35156558547169 - timing_s/adv:0.08815432898700237 - timing_s/update_actor:27.917615528218448 - timing_s/update_weights:31.642243574373424 - timing_s/step:139.1210849871859 - timing_s/stop_profile:5.112960934638977e-05 - timing_per_token_ms/adv:6.108398409261469e-05 - timing_per_token_ms/update_actor:0.019344701529982307 - timing_per_token_ms/gen:0.03292574903538704 - timing_per_token_ms/ref:0.015487868745155921 - perf/total_num_tokens:1899035 - perf/time_per_step:139.1210849871859 - perf/throughput:3412.5578451586175 - frontier/active_count:54.0 - frontier/completed_count:10.0 - frontier/blacklisted_count:1091.0 - frontier/mean_score:2.7300856088344347 - frontier/mean_frontier_pct:0.47554400089302035 - frontier/batch_easy_count:5.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:4.0597609729999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.6368141525699995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.223708372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:2.309637332569999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:1.5524798950999994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:128.0 - frontier/cluster_7/score:3.0053385844734253 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.9661587127989995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:96.0 - frontier/cluster_10/score:4.547137099999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.8823509999999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:2.6569999999999996 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:1.9540999999999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.5234456999999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:64.0 - frontier/cluster_15/score:3.0431939899999993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - 
frontier/cluster_16/score:1.7582798950999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:2.4425519899999992 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:128.0 - frontier/cluster_18/score:3.168124913888056 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.6864119899999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:64.0 - frontier/cluster_20/score:2.9596463929999994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.5171636690999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:80.0 - frontier/cluster_22/score:1.9114489999999997 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.9557222929592992 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.6601 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:80.0 - frontier/cluster_26/score:1.2970036999999999 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:2.3035968967699993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:96.0 - frontier/cluster_28/score:3.3608985189592993 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:112.0 - frontier/cluster_30/score:2.3742947523409996 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.909637332569999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:64.0 - frontier/cluster_32/score:2.841509989999999 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:3.2536463929999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.6569999999999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:4.166924754598999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:2.3596463929999993 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:2.6184155869999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:128.0 - frontier/cluster_38/score:2.6075951921398994 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.8823509999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.1504012810999993 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.1957844129999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.9942351989999993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.597429933471509 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:3.78602673257 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.5717524750999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.5677856999999995 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:3.353187433738999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.3262142999999993 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.462350999999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.7535856999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.815586392999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.060488392999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.225128732569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:32.0 - frontier/cluster_56/score:2.4119299999999995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.5223519899999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.7242867325699995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:330.0 - cluster/prob_snapshot/cluster_0:0.02753787592446833 - cluster/prob_snapshot/cluster_1:0.017885846347179702 - cluster/prob_snapshot/cluster_2:0.0 
- cluster/prob_snapshot/cluster_3:0.021866824619170423 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01566656429228959 - cluster/prob_snapshot/cluster_6:0.010530668926280011 - cluster/prob_snapshot/cluster_7:0.020385594521612857 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.013336705052578774 - cluster/prob_snapshot/cluster_10:0.03084381028935684 - cluster/prob_snapshot/cluster_11:0.019551354066570367 - cluster/prob_snapshot/cluster_12:0.0180227695221288 - cluster/prob_snapshot/cluster_13:0.013254909267290886 - cluster/prob_snapshot/cluster_14:0.017116853689389148 - cluster/prob_snapshot/cluster_15:0.020642372560367907 - cluster/prob_snapshot/cluster_16:0.011926636546774593 - cluster/prob_snapshot/cluster_17:0.016568141347981572 - cluster/prob_snapshot/cluster_18:0.02148979493425615 - cluster/prob_snapshot/cluster_19:0.02500540220526239 - cluster/prob_snapshot/cluster_20:0.020075658565313823 - cluster/prob_snapshot/cluster_21:0.01707424187341565 - cluster/prob_snapshot/cluster_22:0.012965602100227163 - cluster/prob_snapshot/cluster_23:0.020049040894777385 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.011260669809441484 - cluster/prob_snapshot/cluster_26:0.008797741345294801 - cluster/prob_snapshot/cluster_27:0.015625591246660457 - cluster/prob_snapshot/cluster_28:0.022797402858286596 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.016105143808446264 - cluster/prob_snapshot/cluster_31:0.01973644073688021 - cluster/prob_snapshot/cluster_32:0.019274324292283214 - cluster/prob_snapshot/cluster_33:0.02206989802316323 - cluster/prob_snapshot/cluster_34:0.0180227695221288 - cluster/prob_snapshot/cluster_35:0.028264781508540036 - cluster/prob_snapshot/cluster_36:0.016005782120723202 - cluster/prob_snapshot/cluster_37:0.017761046532800372 - cluster/prob_snapshot/cluster_38:0.01768765041586322 - cluster/prob_snapshot/cluster_39:0.019551354066570367 - 
cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02136957327492834 - cluster/prob_snapshot/cluster_42:0.014894285433113238 - cluster/prob_snapshot/cluster_43:0.02031027884329035 - cluster/prob_snapshot/cluster_44:0.0176186981711838 - cluster/prob_snapshot/cluster_45:0.016028530064279393 - cluster/prob_snapshot/cluster_46:0.025681101695795065 - cluster/prob_snapshot/cluster_47:0.02422765210719623 - cluster/prob_snapshot/cluster_48:0.017417617558644398 - cluster/prob_snapshot/cluster_49:0.022745097584786045 - cluster/prob_snapshot/cluster_50:0.015779007974399763 - cluster/prob_snapshot/cluster_51:0.01670244055535693 - cluster/prob_snapshot/cluster_52:0.011894795223334923 - cluster/prob_snapshot/cluster_53:0.019098481230967614 - cluster/prob_snapshot/cluster_54:0.020759682699356168 - cluster/prob_snapshot/cluster_55:0.021876459099098407 - cluster/prob_snapshot/cluster_56:0.016360428488335757 - cluster/prob_snapshot/cluster_57:0.017109434915112128 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.025262311409646687 - cluster/prob_snapshot/cluster_60:0.010106859837400043 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  41%|████▏     | 331/800 [11:19:39<20:28:14, 157.13s/it]
[36m(TaskRunner pid=2823680)[0m step:331 - global_seqlen/min:389197 - global_seqlen/max:533218 - global_seqlen/minmax_diff:144021 - global_seqlen/balanced_min:465092 - global_seqlen/balanced_max:465291 - global_seqlen/mean:465233.5 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.18493727733362086) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.013994966633617878 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04428279101557564) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0008443902601988045) - actor/ppo_kl:np.float64(0.00015941715616570352) - actor/pg_clipfrac_lower:np.float64(1.88977497526473e-05) - actor/grad_norm:np.float64(0.43959831962218654) - perf/mfu/actor:np.float64(0.21979677798635194) - perf/max_memory_allocated_gb:np.float64(97.67490196228027) - perf/max_memory_reserved_gb:np.float64(104.267578125) - perf/cpu_memory_used_gb:np.float64(104.65852355957031) - actor/lr:np.float64(1e-06) - training/global_step:331 - training/epoch:0 - critic/score/mean:0.6785714030265808 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6988496780395508 - critic/rewards/max:1.4966964721679688 - critic/rewards/min:-0.09571602940559387 - critic/advantages/mean:-0.07962770015001297 - critic/advantages/max:2.473951578140259 - critic/advantages/min:-2.4744865894317627 - critic/returns/mean:-0.07962770015001297 - critic/returns/max:2.473951578140259 - critic/returns/min:-2.4744865894317627 - response_length/mean:1345.654296875 - response_length/max:8192.0 - response_length/min:105.0 - response_length/clip_ratio:0.051020409911870956 - response_length_non_aborted/mean:1345.654296875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:105.0 - response_length_non_aborted/clip_ratio:0.051020409911870956 - response/aborted_ratio:0.0 - prompt_length/mean:232.24490356445312 - prompt_length/max:361.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.152572602033615e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.8489881595596671) - timing_s/agent_loop/generate_sequences/max:np.float64(36.93432316277176) - timing_s/agent_loop/generate_sequences/mean:np.float64(7.941203979492457) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(36.93432316277176) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:190 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:39.326052034273744 - timing_s/reward:0.00011530425399541855 - timing_s/old_log_prob:14.79207987524569 - timing_s/ref:31.214481028728187 - timing_s/adv:0.08844056259840727 - timing_s/update_actor:25.621023767627776 - timing_s/update_weights:39.81403506360948 - timing_s/step:151.255452551879 - timing_s/stop_profile:5.645211786031723e-05 - timing_per_token_ms/adv:7.149178956974023e-05 - timing_per_token_ms/update_actor:0.02071100393236921 - timing_per_token_ms/gen:0.037276126035218946 - timing_per_token_ms/ref:0.025232529550582857 - perf/total_num_tokens:1860934 - perf/time_per_step:151.255452551879 - perf/throughput:3075.8130840964554 - frontier/active_count:53.0 - frontier/completed_count:11.0 - frontier/blacklisted_count:1120.0 - frontier/mean_score:2.7506388792077683 - frontier/mean_frontier_pct:0.4823384462810483 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:4.0597609729999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.6368141525699995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:80.0 - frontier/cluster_3/score:3.223708372999999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:2.309637332569999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:1.5524798950999994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.9661587127989995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:112.0 - frontier/cluster_10/score:4.682995969999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.9176456999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.7598999999999996 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:1.9540999999999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.5234456999999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:3.0302357929999992 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - 
frontier/cluster_16/score:1.7582798950999994 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:2.6097863929999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:144.0 - frontier/cluster_18/score:3.7176874397216393 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.6864119899999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.9717524750999993 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.5171636690999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:80.0 - frontier/cluster_22/score:1.9114489999999997 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:112.0 - frontier/cluster_23/score:2.9690056050715095 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:1.6601 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:80.0 - frontier/cluster_26/score:1.8079025899999999 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:2.3035968967699993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:3.2526289632715093 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:128.0 - frontier/cluster_30/score:1.9620063266386998 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.909637332569999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:80.0 - frontier/cluster_32/score:2.889056992999999 - 
frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:3.2536463929999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.6569999999999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:4.166924754598999 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:2.5517524750999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:80.0 - frontier/cluster_37/score:2.6184155869999994 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:128.0 - frontier/cluster_38/score:2.6075951921398994 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.8823509999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.105280896769999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.1957844129999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.9942351989999993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:112.0 - frontier/cluster_44/score:2.597429933471509 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:3.78602673257 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.5717524750999994 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.5677856999999995 - 
frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:3.353187433738999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.3262142999999993 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:48.0 - frontier/cluster_51/score:2.462350999999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.7535856999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.815586392999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.060488392999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.225128732569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.5883509999999994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:64.0 - frontier/cluster_57/score:2.5223519899999998 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.7242867325699995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:331.0 - cluster/prob_snapshot/cluster_0:0.02784780809306794 - cluster/prob_snapshot/cluster_1:0.01808714724492597 - cluster/prob_snapshot/cluster_2:0.0 
- cluster/prob_snapshot/cluster_3:0.02211293046964326 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.015842887704412317 - cluster/prob_snapshot/cluster_6:0.010649189071627404 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.01348680645946493 - cluster/prob_snapshot/cluster_10:0.032122869779892954 - cluster/prob_snapshot/cluster_11:0.020013502784407618 - cluster/prob_snapshot/cluster_12:0.01893145090738282 - cluster/prob_snapshot/cluster_13:0.013404090082291665 - cluster/prob_snapshot/cluster_14:0.017309499759772553 - cluster/prob_snapshot/cluster_15:0.020785811135538877 - cluster/prob_snapshot/cluster_16:0.01206086797185545 - cluster/prob_snapshot/cluster_17:0.01790175114237298 - cluster/prob_snapshot/cluster_18:0.025501364996588265 - cluster/prob_snapshot/cluster_19:0.025286831991402733 - cluster/prob_snapshot/cluster_20:0.020384646578227122 - cluster/prob_snapshot/cluster_21:0.01726640835806162 - cluster/prob_snapshot/cluster_22:0.013111526832662772 - cluster/prob_snapshot/cluster_23:0.020365804506016782 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.011387405939108744 - cluster/prob_snapshot/cluster_26:0.012401253352626998 - cluster/prob_snapshot/cluster_27:0.015801453517011725 - cluster/prob_snapshot/cluster_28:0.022311310387371306 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013458323291689936 - cluster/prob_snapshot/cluster_31:0.019958569629275426 - cluster/prob_snapshot/cluster_32:0.019817399409982436 - cluster/prob_snapshot/cluster_33:0.02231828941594358 - cluster/prob_snapshot/cluster_34:0.018225611457268795 - cluster/prob_snapshot/cluster_35:0.02858289482461291 - cluster/prob_snapshot/cluster_36:0.017503669230822944 - cluster/prob_snapshot/cluster_37:0.01796094283865954 - cluster/prob_snapshot/cluster_38:0.017886720666083548 - cluster/prob_snapshot/cluster_39:0.019771399853018505 - 
cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.021300580764098315 - cluster/prob_snapshot/cluster_42:0.015061917032467082 - cluster/prob_snapshot/cluster_43:0.020538866145521983 - cluster/prob_snapshot/cluster_44:0.01781699238047846 - cluster/prob_snapshot/cluster_45:0.016208927314085873 - cluster/prob_snapshot/cluster_46:0.025970136317144805 - cluster/prob_snapshot/cluster_47:0.02450032850309023 - cluster/prob_snapshot/cluster_48:0.017613648654019938 - cluster/prob_snapshot/cluster_49:0.023001088186196186 - cluster/prob_snapshot/cluster_50:0.015956596913113477 - cluster/prob_snapshot/cluster_51:0.016890422505614327 - cluster/prob_snapshot/cluster_52:0.012028668281980699 - cluster/prob_snapshot/cluster_53:0.01931343004260102 - cluster/prob_snapshot/cluster_54:0.020993327933872395 - cluster/prob_snapshot/cluster_55:0.02212267338332502 - cluster/prob_snapshot/cluster_56:0.017754715709835583 - cluster/prob_snapshot/cluster_57:0.017301997488975818 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.02554663264707639 - cluster/prob_snapshot/cluster_60:0.010220610113410054 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  42%|████▏     | 332/800 [11:22:15<20:22:49, 156.77s/it]
[36m(TaskRunner pid=2823680)[0m step:332 - global_seqlen/min:391610 - global_seqlen/max:610458 - global_seqlen/minmax_diff:218848 - global_seqlen/balanced_min:502230 - global_seqlen/balanced_max:502403 - global_seqlen/mean:502324.0 - frontier/skipped_zero_acc_count:38.0 - actor/entropy:np.float64(0.1616988264438179) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.014182784594595432 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.06339031687457464) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0016402152609468128) - actor/ppo_kl:np.float64(0.0015460265240562876) - actor/pg_clipfrac_lower:np.float64(0.0001428787476874681) - actor/grad_norm:np.float64(0.41819046437740326) - perf/mfu/actor:np.float64(0.25613003548487373) - perf/max_memory_allocated_gb:np.float64(97.67490196228027) - perf/max_memory_reserved_gb:np.float64(104.267578125) - perf/cpu_memory_used_gb:np.float64(105.72206115722656) - actor/lr:np.float64(1e-06) - training/global_step:332 - training/epoch:0 - critic/score/mean:0.6291666626930237 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6520010828971863 - critic/rewards/max:1.6445541381835938 - critic/rewards/min:-0.10574878752231598 - critic/advantages/mean:-0.1124548390507698 - critic/advantages/max:2.473939895629883 - critic/advantages/min:-2.474821090698242 - critic/returns/mean:-0.1124548390507698 - critic/returns/max:2.473939895629883 - critic/returns/min:-2.474821090698242 - response_length/mean:1526.7861328125 - response_length/max:8192.0 - response_length/min:119.0 - response_length/clip_ratio:0.0763888880610466 - response_length_non_aborted/mean:1526.7861328125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:119.0 - response_length_non_aborted/clip_ratio:0.0763888880610466 - response/aborted_ratio:0.0 - prompt_length/mean:242.92222595214844 - prompt_length/max:694.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.583255112171173e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9858715971931815) - timing_s/agent_loop/generate_sequences/max:np.float64(38.522065015509725) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.821086000232754) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(38.522065015509725) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:191 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:40.79880119394511 - timing_s/reward:0.0001674918457865715 - timing_s/old_log_prob:12.70388441067189 - timing_s/ref:34.64311562106013 - timing_s/adv:0.07542835921049118 - timing_s/update_actor:23.9186275517568 - timing_s/update_weights:42.46700287703425 - timing_s/step:155.01546606142074 - timing_s/stop_profile:6.88266009092331e-05 - timing_per_token_ms/adv:5.91971049925766e-05 - timing_per_token_ms/update_actor:0.01877163339200339 - timing_per_token_ms/gen:0.037113909568524575 - timing_per_token_ms/ref:0.02718834367014349 - perf/total_num_tokens:2009296 - perf/time_per_step:155.01546606142074 - perf/throughput:3240.4766618639687 - frontier/active_count:53.0 - frontier/completed_count:11.0 - frontier/blacklisted_count:1158.0 - frontier/mean_score:2.7390735699492392 - frontier/mean_frontier_pct:0.4988066726581894 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:4.0597609729999995 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.6368141525699995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.556595861099999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:96.0 - frontier/cluster_5/score:2.309637332569999 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:1.5524798950999994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.9661587127989995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:128.0 - frontier/cluster_10/score:4.778097178999999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.9176456999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.8319299999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:48.0 - frontier/cluster_13/score:1.9540999999999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.5234456999999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:3.0302357929999992 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:1.7582798950999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:2.6097863929999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:160.0 - frontier/cluster_18/score:4.102381207805147 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.6864119899999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.9717524750999993 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.5171636690999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:80.0 - frontier/cluster_22/score:1.9114489999999997 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.3783039235500563 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:2.06207 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:96.0 - frontier/cluster_26/score:2.1655318129999994 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:2.3035968967699993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:3.2526289632715093 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:128.0 - frontier/cluster_30/score:1.9620063266386998 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.909637332569999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:96.0 - frontier/cluster_32/score:2.322339895099999 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:3.2536463929999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:16.0 - frontier/cluster_34/score:2.6569999999999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:128.0 - frontier/cluster_35/score:3.816847328219299 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:2.5517524750999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:96.0 - frontier/cluster_37/score:2.7328909108999992 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:144.0 - frontier/cluster_38/score:2.725316634497929 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.8823509999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.105280896769999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.1957844129999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:80.0 - frontier/cluster_43/score:2.9942351989999993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:2.718200953430056 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:3.5502187127989995 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:80.0 - frontier/cluster_47/score:3.4002267325699993 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.5677856999999995 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:3.353187433738999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.3262142999999993 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.623645699999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.7535856999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.815586392999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.060488392999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.225128732569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.5883509999999994 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:2.6656463929999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.7242867325699995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.49 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:332.0 - cluster/prob_snapshot/cluster_0:0.027965391102265636 - cluster/prob_snapshot/cluster_1:0.018163517392039617 - cluster/prob_snapshot/cluster_2:0.0 - 
cluster/prob_snapshot/cluster_3:0.01761093907291351 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01590978181702758 - cluster/prob_snapshot/cluster_6:0.010694153605007282 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.013543752387944146 - cluster/prob_snapshot/cluster_10:0.03291360186573406 - cluster/prob_snapshot/cluster_11:0.020098006666153443 - cluster/prob_snapshot/cluster_12:0.019507559817177226 - cluster/prob_snapshot/cluster_13:0.013460686753820193 - cluster/prob_snapshot/cluster_14:0.017382586412146014 - cluster/prob_snapshot/cluster_15:0.020873575968367496 - cluster/prob_snapshot/cluster_16:0.012111793098347538 - cluster/prob_snapshot/cluster_17:0.01797733848347335 - cluster/prob_snapshot/cluster_18:0.028258977730425072 - cluster/prob_snapshot/cluster_19:0.02539360168001481 - cluster/prob_snapshot/cluster_20:0.02047071755652778 - cluster/prob_snapshot/cluster_21:0.017339313063738704 - cluster/prob_snapshot/cluster_22:0.013166888201679983 - cluster/prob_snapshot/cluster_23:0.01638278702230635 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.01420443085535541 - cluster/prob_snapshot/cluster_26:0.014917120613185264 - cluster/prob_snapshot/cluster_27:0.015868172680258554 - cluster/prob_snapshot/cluster_28:0.02240551640192453 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013515148954453183 - cluster/prob_snapshot/cluster_31:0.020042841564375267 - cluster/prob_snapshot/cluster_32:0.015997282566829096 - cluster/prob_snapshot/cluster_33:0.022412524898352156 - cluster/prob_snapshot/cluster_34:0.01830256624783801 - cluster/prob_snapshot/cluster_35:0.02629209675672459 - cluster/prob_snapshot/cluster_36:0.0175775757333844 - cluster/prob_snapshot/cluster_37:0.018825335696221905 - cluster/prob_snapshot/cluster_38:0.018773160801367438 - cluster/prob_snapshot/cluster_39:0.01985488149304559 - 
cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.021390519131079692 - cluster/prob_snapshot/cluster_42:0.015125513618706284 - cluster/prob_snapshot/cluster_43:0.020625588291797486 - cluster/prob_snapshot/cluster_44:0.018724144909706454 - cluster/prob_snapshot/cluster_45:0.01627736697163764 - cluster/prob_snapshot/cluster_46:0.024455443426916777 - cluster/prob_snapshot/cluster_47:0.02342223373373436 - cluster/prob_snapshot/cluster_48:0.017688019527475007 - cluster/prob_snapshot/cluster_49:0.023098206679497916 - cluster/prob_snapshot/cluster_50:0.016023971145057627 - cluster/prob_snapshot/cluster_51:0.018072807389953073 - cluster/prob_snapshot/cluster_52:0.012079457450324195 - cluster/prob_snapshot/cluster_53:0.019394977976813606 - cluster/prob_snapshot/cluster_54:0.021081968973888515 - cluster/prob_snapshot/cluster_55:0.02221608290766588 - cluster/prob_snapshot/cluster_56:0.017829682216845222 - cluster/prob_snapshot/cluster_57:0.018362126345951423 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.02565449930327685 - cluster/prob_snapshot/cluster_60:0.010263765039246758 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826756) WARNING:2026-04-12 22:54:40,567:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  42%|████▏     | 333/800 [11:24:30<19:30:26, 150.38s/it]
[36m(TaskRunner pid=2823680)[0m step:333 - global_seqlen/min:460670 - global_seqlen/max:693320 - global_seqlen/minmax_diff:232650 - global_seqlen/balanced_min:550947 - global_seqlen/balanced_max:550955 - global_seqlen/mean:550950.25 - frontier/skipped_zero_acc_count:39.0 - actor/entropy:np.float64(0.16773692632300985) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008886336348950863 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.01893875896348618) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0008737228916692806) - actor/ppo_kl:np.float64(0.0003030399058793086) - actor/pg_clipfrac_lower:np.float64(1.7526981254276406e-05) - actor/grad_norm:np.float64(0.5297526121139526) - perf/mfu/actor:np.float64(0.2855006008285116) - perf/max_memory_allocated_gb:np.float64(97.67490196228027) - perf/max_memory_reserved_gb:np.float64(104.267578125) - perf/cpu_memory_used_gb:np.float64(105.6338472366333) - actor/lr:np.float64(1e-06) - training/global_step:333 - training/epoch:0 - critic/score/mean:0.6699438095092773 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.7044132351875305 - critic/rewards/max:1.292157530784607 - critic/rewards/min:-0.47371724247932434 - critic/advantages/mean:-0.07778304070234299 - critic/advantages/max:2.474221706390381 - critic/advantages/min:-2.474822521209717 - critic/returns/mean:-0.07778304070234299 - critic/returns/max:2.474221706390381 - critic/returns/min:-2.474822521209717 - response_length/mean:1552.5576171875 - response_length/max:8192.0 - response_length/min:133.0 - response_length/clip_ratio:0.0744381994009018 - response_length_non_aborted/mean:1552.5576171875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:133.0 - response_length_non_aborted/clip_ratio:0.0744381994009018 - response/aborted_ratio:0.0 - prompt_length/mean:244.7640380859375 - prompt_length/max:555.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.886586874723434e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0284692076966166) - timing_s/agent_loop/generate_sequences/max:np.float64(39.79041742812842) - timing_s/agent_loop/generate_sequences/mean:np.float64(10.025327218528219) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(39.79041742812842) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:190 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:42.200986173935235 - timing_s/reward:0.00017473474144935608 - timing_s/old_log_prob:13.460509246215224 - timing_s/ref:24.520387058146298 - timing_s/adv:0.08526104968041182 - timing_s/update_actor:23.691792597062886 - timing_s/update_weights:30.126394053921103 - timing_s/step:134.51559930481017 - timing_s/stop_profile:6.565917283296585e-05 - timing_per_token_ms/adv:6.662617493446617e-05 - timing_per_token_ms/update_actor:0.018513653350501164 - timing_per_token_ms/gen:0.038176392681100894 - timing_per_token_ms/ref:0.0191611480707844 - perf/total_num_tokens:2203801 - perf/time_per_step:134.51559930481017 - perf/throughput:4095.8093548061715 - frontier/active_count:53.0 - frontier/completed_count:11.0 - frontier/blacklisted_count:1197.0 - frontier/mean_score:2.7258987090245643 - frontier/mean_frontier_pct:0.518051815756838 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.7418326810999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:96.0 - frontier/cluster_1/score:2.6368141525699995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.556595861099999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.9167461327989992 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:1.5524798950999994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.9661587127989995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:144.0 - frontier/cluster_10/score:4.844668025299999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.9176456999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.8319299999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.86787 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:64.0 - frontier/cluster_14/score:2.6664119899999994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:3.0302357929999992 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:1.7582798950999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:80.0 - frontier/cluster_17/score:2.6097863929999994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:160.0 - frontier/cluster_18/score:4.102381207805147 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:64.0 - frontier/cluster_19/score:3.6864119899999994 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.9717524750999993 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.5171636690999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:96.0 - frontier/cluster_22/score:1.6380142999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.3783039235500563 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:2.06207 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:112.0 - frontier/cluster_26/score:1.8158722690999995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:2.3035968967699993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:3.2526289632715093 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:128.0 - frontier/cluster_30/score:1.9620063266386998 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.909637332569999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.9256379265699992 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:80.0 - frontier/cluster_33/score:3.2536463929999995 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.7598999999999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:144.0 - frontier/cluster_35/score:4.1717931297535085 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:2.5517524750999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:2.213023637629999 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:144.0 - frontier/cluster_38/score:2.725316634497929 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.8823509999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.105280896769999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.1957844129999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.3959646392999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:128.0 - frontier/cluster_44/score:2.718200953430056 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:3.5502187127989995 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:3.280158712798999 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:64.0 - frontier/cluster_48/score:2.5677856999999995 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:3.353187433738999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.3262142999999993 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.736551989999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.7535856999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.815586392999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.060488392999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.225128732569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.7118456999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:2.6656463929999994 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.7242867325699995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.9429999999999998 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:333.0 - cluster/prob_snapshot/cluster_0:0.025899941326063443 - cluster/prob_snapshot/cluster_1:0.018251305619368382 - cluster/prob_snapshot/cluster_2:0.0 - 
cluster/prob_snapshot/cluster_3:0.017696056569125096 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.013267191937042781 - cluster/prob_snapshot/cluster_6:0.010745840773714845 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.013609212286919355 - cluster/prob_snapshot/cluster_10:0.03353346562856969 - cluster/prob_snapshot/cluster_11:0.02019514470059806 - cluster/prob_snapshot/cluster_12:0.01960184409366931 - cluster/prob_snapshot/cluster_13:0.01985061093350168 - cluster/prob_snapshot/cluster_14:0.018456173746339257 - cluster/prob_snapshot/cluster_15:0.02097446249781682 - cluster/prob_snapshot/cluster_16:0.012170332026845093 - cluster/prob_snapshot/cluster_17:0.018064226867671722 - cluster/prob_snapshot/cluster_18:0.02839555951178006 - cluster/prob_snapshot/cluster_19:0.025516334476139324 - cluster/prob_snapshot/cluster_20:0.020569656983712905 - cluster/prob_snapshot/cluster_21:0.017423117732411043 - cluster/prob_snapshot/cluster_22:0.011337886505598964 - cluster/prob_snapshot/cluster_23:0.016461968592723066 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.014273083956959629 - cluster/prob_snapshot/cluster_26:0.012568970671208582 - cluster/prob_snapshot/cluster_27:0.015944867007710632 - cluster/prob_snapshot/cluster_28:0.022513807132450794 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013580470606817479 - cluster/prob_snapshot/cluster_31:0.02013971297389306 - cluster/prob_snapshot/cluster_32:0.01332873849900307 - cluster/prob_snapshot/cluster_33:0.02252084950246493 - cluster/prob_snapshot/cluster_34:0.019103272155073726 - cluster/prob_snapshot/cluster_35:0.02887600983091709 - cluster/prob_snapshot/cluster_36:0.017662531977324644 - cluster/prob_snapshot/cluster_37:0.015317943706386875 - cluster/prob_snapshot/cluster_38:0.018863895567797214 - cluster/prob_snapshot/cluster_39:0.01995084445068622 - 
cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02149390415919008 - cluster/prob_snapshot/cluster_42:0.015198618513499693 - cluster/prob_snapshot/cluster_43:0.01658421123174062 - cluster/prob_snapshot/cluster_44:0.01881464277167841 - cluster/prob_snapshot/cluster_45:0.016356039024036817 - cluster/prob_snapshot/cluster_46:0.024573641900298857 - cluster/prob_snapshot/cluster_47:0.0227043605211912 - cluster/prob_snapshot/cluster_48:0.017773509570276638 - cluster/prob_snapshot/cluster_49:0.023209845332689355 - cluster/prob_snapshot/cluster_50:0.0161014184803523 - cluster/prob_snapshot/cluster_51:0.018941663622406093 - cluster/prob_snapshot/cluster_52:0.012137840093606819 - cluster/prob_snapshot/cluster_53:0.019488718120801968 - cluster/prob_snapshot/cluster_54:0.021183862712026966 - cluster/prob_snapshot/cluster_55:0.022323458064941733 - cluster/prob_snapshot/cluster_56:0.01877065344746781 - cluster/prob_snapshot/cluster_57:0.01845087449352136 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.025778493074319717 - cluster/prob_snapshot/cluster_60:0.013448914017648556 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826762) WARNING:2026-04-12 22:56:52,227:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  42%|████▏     | 334/800 [11:26:39<18:38:31, 144.02s/it]
(RewardLoopWorker pid=2826760) WARNING:2026-04-12 22:56:54,820:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m step:334 - global_seqlen/min:420731 - global_seqlen/max:580670 - global_seqlen/minmax_diff:159939 - global_seqlen/balanced_min:476965 - global_seqlen/balanced_max:477121 - global_seqlen/mean:477048.75 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.15663599909748882) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009890716522932053 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05305847246199846) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0013426364279591023) - actor/ppo_kl:np.float64(0.0006729149946191152) - actor/pg_clipfrac_lower:np.float64(5.295994581426081e-05) - actor/grad_norm:np.float64(0.921592136224111) - perf/mfu/actor:np.float64(0.23668238710929224) - perf/max_memory_allocated_gb:np.float64(97.67490196228027) - perf/max_memory_reserved_gb:np.float64(104.267578125) - perf/cpu_memory_used_gb:np.float64(105.78994369506836) - actor/lr:np.float64(1e-06) - training/global_step:334 - training/epoch:0 - critic/score/mean:0.6605263352394104 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6796537637710571 - critic/rewards/max:1.2541258335113525 - critic/rewards/min:-0.13822850584983826 - critic/advantages/mean:-0.18212921917438507 - critic/advantages/max:2.474118709564209 - critic/advantages/min:-2.4747064113616943 - critic/returns/mean:-0.18212921917438507 - critic/returns/max:2.474118709564209 - critic/returns/min:-2.4747064113616943 - response_length/mean:1440.74609375 - response_length/max:8192.0 - response_length/min:98.0 - response_length/clip_ratio:0.051315788179636 - response_length_non_aborted/mean:1440.74609375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:98.0 - response_length_non_aborted/clip_ratio:0.051315788179636 - response/aborted_ratio:0.0 - prompt_length/mean:229.54736328125 - prompt_length/max:358.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010671373456716537 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.8606390804052353) - timing_s/agent_loop/generate_sequences/max:np.float64(37.570061691105366) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.537794024995492) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(37.570061691105366) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:207 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:40.498155129142106 - timing_s/reward:0.0010134810581803322 - timing_s/old_log_prob:10.922184932045639 - timing_s/ref:23.12718319799751 - timing_s/adv:0.0972017515450716 - timing_s/update_actor:24.390200971625745 - timing_s/update_weights:29.285277286544442 - timing_s/step:128.92698471993208 - timing_s/stop_profile:7.524527609348297e-05 - timing_per_token_ms/adv:7.657160106999133e-05 - timing_per_token_ms/update_actor:0.019213611988774228 - timing_per_token_ms/gen:0.03698573119476852 - timing_per_token_ms/ref:0.018218657766558122 - perf/total_num_tokens:1908195 - perf/time_per_step:128.92698471993208 - perf/throughput:3700.1466452992163 - frontier/active_count:53.0 - frontier/completed_count:11.0 - frontier/blacklisted_count:1230.0 - frontier/mean_score:2.7247985028917032 - frontier/mean_frontier_pct:0.53255956122286 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:96.0 - frontier/cluster_0/score:3.7418326810999996 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:112.0 - frontier/cluster_1/score:2.1457699067989995 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.689617102769999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.9167461327989992 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:1.5524798950999994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:1.9661587127989995 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:160.0 - frontier/cluster_10/score:4.89126761771 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.9176456999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:32.0 - frontier/cluster_12/score:2.8319299999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.86787 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:2.7664883929999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:3.0302357929999992 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:1.7582798950999994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:96.0 - frontier/cluster_17/score:2.726850475099999 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:160.0 - frontier/cluster_18/score:3.7716668454636024 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:4.0804883929999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.9717524750999993 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.5171636690999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:96.0 - frontier/cluster_22/score:1.6380142999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.564812746485039 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:2.06207 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:112.0 - frontier/cluster_26/score:1.8158722690999995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:2.3035968967699993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:112.0 - frontier/cluster_28/score:3.2526289632715093 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:128.0 - frontier/cluster_30/score:1.9620063266386998 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.909637332569999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:1.9256379265699992 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:3.7775524750999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.8319299999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:144.0 - frontier/cluster_35/score:3.8202551908274556 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:2.5517524750999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:2.449116546340999 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:144.0 - frontier/cluster_38/score:2.725316634497929 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.8823509999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:80.0 - frontier/cluster_41/score:3.105280896769999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.1957844129999997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.3959646392999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:144.0 - frontier/cluster_44/score:2.2027406674010392 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:1.9540999999999997 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:96.0 - frontier/cluster_46/score:3.5502187127989995 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:3.280158712798999 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:80.0 - frontier/cluster_48/score:2.6974499899999995 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:3.353187433738999 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.3262142999999993 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.736551989999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.7535856999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.815586392999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.060488392999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.225128732569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.7118456999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:2.7659524750999993 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.7242867325699995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:32.0 - frontier/cluster_60/score:1.9429999999999998 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:334.0 - cluster/prob_snapshot/cluster_0:0.02591039908074051 - cluster/prob_snapshot/cluster_1:0.014858428839277007 - cluster/prob_snapshot/cluster_2:0.0 - 
cluster/prob_snapshot/cluster_3:0.018624310183390936 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0132725489004731 - cluster/prob_snapshot/cluster_6:0.01075017967800793 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.013614707349693937 - cluster/prob_snapshot/cluster_10:0.03386968546875601 - cluster/prob_snapshot/cluster_11:0.020203298999725144 - cluster/prob_snapshot/cluster_12:0.01960975883270941 - cluster/prob_snapshot/cluster_13:0.019858626118428896 - cluster/prob_snapshot/cluster_14:0.019156607048980666 - cluster/prob_snapshot/cluster_15:0.020982931466164048 - cluster/prob_snapshot/cluster_16:0.012175246105416657 - cluster/prob_snapshot/cluster_17:0.018882133453005573 - cluster/prob_snapshot/cluster_18:0.026116986379206807 - cluster/prob_snapshot/cluster_19:0.02825542767879149 - cluster/prob_snapshot/cluster_20:0.020577962501621958 - cluster/prob_snapshot/cluster_21:0.017430152755720994 - cluster/prob_snapshot/cluster_22:0.01134246446329158 - cluster/prob_snapshot/cluster_23:0.017760099794002917 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.014278847074668194 - cluster/prob_snapshot/cluster_26:0.012574045710189094 - cluster/prob_snapshot/cluster_27:0.01595130514999929 - cluster/prob_snapshot/cluster_28:0.02252289765002664 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013585954064413666 - cluster/prob_snapshot/cluster_31:0.02014784489106214 - cluster/prob_snapshot/cluster_32:0.01333412031341039 - cluster/prob_snapshot/cluster_33:0.02615774154538266 - cluster/prob_snapshot/cluster_34:0.01960975883270941 - cluster/prob_snapshot/cluster_35:0.026453437398358252 - cluster/prob_snapshot/cluster_36:0.017669663670175577 - cluster/prob_snapshot/cluster_37:0.01695895902333221 - cluster/prob_snapshot/cluster_38:0.018871512341504085 - cluster/prob_snapshot/cluster_39:0.01995890010742455 - 
cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.021502582865211818 - cluster/prob_snapshot/cluster_42:0.015204755339133524 - cluster/prob_snapshot/cluster_43:0.016590907525388193 - cluster/prob_snapshot/cluster_44:0.015252924069004446 - cluster/prob_snapshot/cluster_45:0.013531206539355656 - cluster/prob_snapshot/cluster_46:0.024583564128124787 - cluster/prob_snapshot/cluster_47:0.02271352797950476 - cluster/prob_snapshot/cluster_48:0.018678549175719178 - cluster/prob_snapshot/cluster_49:0.023219216893247203 - cluster/prob_snapshot/cluster_50:0.01610791983424729 - cluster/prob_snapshot/cluster_51:0.01894931179692683 - cluster/prob_snapshot/cluster_52:0.012142741052740678 - cluster/prob_snapshot/cluster_53:0.01949658718237674 - cluster/prob_snapshot/cluster_54:0.02119241623099313 - cluster/prob_snapshot/cluster_55:0.022332471724279716 - cluster/prob_snapshot/cluster_56:0.01877823257231642 - cluster/prob_snapshot/cluster_57:0.019152896074213232 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.02578890179120142 - cluster/prob_snapshot/cluster_60:0.013454344355953144 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  42%|████▏     | 335/800 [11:29:15<19:02:46, 147.46s/it]
[36m(TaskRunner pid=2823680)[0m step:335 - global_seqlen/min:536291 - global_seqlen/max:634429 - global_seqlen/minmax_diff:98138 - global_seqlen/balanced_min:576752 - global_seqlen/balanced_max:577007 - global_seqlen/mean:576832.0 - frontier/skipped_zero_acc_count:36.0 - actor/entropy:np.float64(0.2072391422951351) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010924987494945526 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.041883337165927514) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0018095008489699862) - actor/ppo_kl:np.float64(0.0028077075313375317) - actor/pg_clipfrac_lower:np.float64(9.941881513520417e-05) - actor/grad_norm:np.float64(0.40472189833720523) - perf/mfu/actor:np.float64(0.26010574652690976) - perf/max_memory_allocated_gb:np.float64(112.75414419174194) - perf/max_memory_reserved_gb:np.float64(119.345703125) - perf/cpu_memory_used_gb:np.float64(105.27887344360352) - actor/lr:np.float64(1e-06) - training/global_step:335 - training/epoch:0 - critic/score/mean:0.64673912525177 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6955382227897644 - critic/rewards/max:1.4983044862747192 - critic/rewards/min:-0.10993857681751251 - critic/advantages/mean:-0.1137382835149765 - critic/advantages/max:2.4537644386291504 - critic/advantages/min:-2.474839448928833 - critic/returns/mean:-0.1137382835149765 - critic/returns/max:2.4537644386291504 - critic/returns/min:-2.474839448928833 - response_length/mean:1864.5054931640625 - response_length/max:8192.0 - response_length/min:82.0 - response_length/clip_ratio:0.11005435138940811 - response_length_non_aborted/mean:1864.5054931640625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:82.0 - response_length_non_aborted/clip_ratio:0.11005435138940811 - response/aborted_ratio:0.0 - prompt_length/mean:235.95652770996094 - prompt_length/max:728.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010304246097803116 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9445822453126311) - timing_s/agent_loop/generate_sequences/max:np.float64(45.24101135600358) - timing_s/agent_loop/generate_sequences/mean:np.float64(10.891445333138108) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(45.24101135600358) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:191 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:47.11815500166267 - timing_s/reward:0.0002187788486480713 - timing_s/old_log_prob:11.540726888924837 - timing_s/ref:32.005435296334326 - timing_s/adv:0.08849716186523438 - timing_s/update_actor:27.300018776208162 - timing_s/update_weights:35.709138796664774 - timing_s/step:154.18164197076112 - timing_s/stop_profile:7.400289177894592e-05 - timing_per_token_ms/adv:5.7244887812744595e-05 - timing_per_token_ms/update_actor:0.01765917097442861 - timing_per_token_ms/gen:0.03433577137664921 - timing_per_token_ms/ref:0.020702896164362345 - perf/total_num_tokens:2307328 - perf/time_per_step:154.18164197076112 - perf/throughput:3741.249558811872 - frontier/active_count:51.0 - frontier/completed_count:13.0 - frontier/blacklisted_count:1266.0 - frontier/mean_score:2.737937903333652 - frontier/mean_frontier_pct:0.5301767130148168 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:4.119282876769999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:112.0 - frontier/cluster_1/score:2.4020389347592994 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.689617102769999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:112.0 - frontier/cluster_5/score:1.9167461327989992 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:112.0 - frontier/cluster_6/score:1.5524798950999994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:2.276311098959299 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:160.0 - frontier/cluster_10/score:4.89126761771 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.9176456999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:2.8823509999999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.86787 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:2.7664883929999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:80.0 - frontier/cluster_15/score:3.021165055099999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.1307959265699994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:96.0 - frontier/cluster_17/score:2.726850475099999 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:160.0 - frontier/cluster_18/score:3.7716668454636024 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.7563418750999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.9717524750999993 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.5171636690999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:96.0 - frontier/cluster_22/score:1.6380142999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.564812746485039 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:2.06207 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:112.0 - frontier/cluster_26/score:1.8158722690999995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:2.3035968967699993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:128.0 - frontier/cluster_30/score:1.9620063266386998 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.909637332569999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:112.0 - frontier/cluster_32/score:2.2479465485989993 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:3.7775524750999994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.8319299999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:144.0 - frontier/cluster_35/score:3.8202551908274556 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:80.0 - frontier/cluster_36/score:2.5517524750999994 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:112.0 - frontier/cluster_37/score:2.449116546340999 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:160.0 - frontier/cluster_38/score:2.2077216441485503 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.8823509999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:3.073696627738999 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.4370490890999994 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.3959646392999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:160.0 - frontier/cluster_44/score:1.8419184671807274 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:1.9540999999999997 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:3.3851530989592993 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:3.280158712798999 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:80.0 - frontier/cluster_48/score:2.6974499899999995 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:3.2472312036172992 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.3262142999999993 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:64.0 - frontier/cluster_51/score:2.736551989999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:80.0 - frontier/cluster_52/score:1.7535856999999995 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.815586392999999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.060488392999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.225128732569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.7118456999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:2.7659524750999993 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.7242867325699995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:335.0 - cluster/prob_snapshot/cluster_0:0.029500396041613 - cluster/prob_snapshot/cluster_1:0.01720229030212583 - cluster/prob_snapshot/cluster_2:0.0 - 
cluster/prob_snapshot/cluster_3:0.019261791944287718 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.013726848026794983 - cluster/prob_snapshot/cluster_6:0.011118141949018901 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.016301885775291486 - cluster/prob_snapshot/cluster_10:0.03502899319725903 - cluster/prob_snapshot/cluster_11:0.020894827141999892 - cluster/prob_snapshot/cluster_12:0.020642062847990944 - cluster/prob_snapshot/cluster_13:0.020538356633133092 - cluster/prob_snapshot/cluster_14:0.019812308520559596 - cluster/prob_snapshot/cluster_15:0.021636184816328136 - cluster/prob_snapshot/cluster_16:0.015259773508674353 - cluster/prob_snapshot/cluster_17:0.019528440111592292 - cluster/prob_snapshot/cluster_18:0.027010931030170663 - cluster/prob_snapshot/cluster_19:0.026901180690469116 - cluster/prob_snapshot/cluster_20:0.021282314804715607 - cluster/prob_snapshot/cluster_21:0.0180267603273306 - cluster/prob_snapshot/cluster_22:0.011730699740076032 - cluster/prob_snapshot/cluster_23:0.01836800095001353 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.014767590254260042 - cluster/prob_snapshot/cluster_26:0.013004436136572583 - cluster/prob_snapshot/cluster_27:0.016497294021291384 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.014050980571981656 - cluster/prob_snapshot/cluster_31:0.02083747492368926 - cluster/prob_snapshot/cluster_32:0.016098752003175492 - cluster/prob_snapshot/cluster_33:0.027053081183588654 - cluster/prob_snapshot/cluster_34:0.02028097099940673 - cluster/prob_snapshot/cluster_35:0.027358898255078576 - cluster/prob_snapshot/cluster_36:0.018274469335459367 - cluster/prob_snapshot/cluster_37:0.017539438351410166 - cluster/prob_snapshot/cluster_38:0.01581067986840752 - cluster/prob_snapshot/cluster_39:0.020642062847990944 - cluster/prob_snapshot/cluster_40:0.0 - 
cluster/prob_snapshot/cluster_41:0.022012391608602234 - cluster/prob_snapshot/cluster_42:0.01745301681191544 - cluster/prob_snapshot/cluster_43:0.017158789011468258 - cluster/prob_snapshot/cluster_44:0.013190966943449888 - cluster/prob_snapshot/cluster_45:0.01399435912255624 - cluster/prob_snapshot/cluster_46:0.024242898598674893 - cluster/prob_snapshot/cluster_47:0.02349097743508061 - cluster/prob_snapshot/cluster_48:0.019317887454682842 - cluster/prob_snapshot/cluster_49:0.023255165865304253 - cluster/prob_snapshot/cluster_50:0.016659269387557327 - cluster/prob_snapshot/cluster_51:0.019597917867870594 - cluster/prob_snapshot/cluster_52:0.01255836857785127 - cluster/prob_snapshot/cluster_53:0.020163925655915644 - cluster/prob_snapshot/cluster_54:0.021917800349038962 - cluster/prob_snapshot/cluster_55:0.02309687820483047 - cluster/prob_snapshot/cluster_56:0.019420982862064333 - cluster/prob_snapshot/cluster_57:0.01980847052478005 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.026671616606599537 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  42%|████▏     | 336/800 [11:31:39<18:53:15, 146.54s/it]
[36m(TaskRunner pid=2823680)[0m step:336 - global_seqlen/min:493592 - global_seqlen/max:709839 - global_seqlen/minmax_diff:216247 - global_seqlen/balanced_min:557009 - global_seqlen/balanced_max:557093 - global_seqlen/mean:557051.5 - frontier/skipped_zero_acc_count:40.0 - actor/entropy:np.float64(0.18535481732000003) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0030121421441435814 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.018763098065392114) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0009225006137553878) - actor/ppo_kl:np.float64(-0.00027885174573973677) - actor/pg_clipfrac_lower:np.float64(2.1790183072053797e-05) - actor/grad_norm:np.float64(0.4926167401400479) - perf/mfu/actor:np.float64(0.2729726546189255) - perf/max_memory_allocated_gb:np.float64(112.75414419174194) - perf/max_memory_reserved_gb:np.float64(119.345703125) - perf/cpu_memory_used_gb:np.float64(107.94802856445312) - actor/lr:np.float64(1e-06) - training/global_step:336 - training/epoch:0 - critic/score/mean:0.578125 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6204540729522705 - critic/rewards/max:1.513434648513794 - critic/rewards/min:-0.08442679792642593 - critic/advantages/mean:-0.03083660453557968 - critic/advantages/max:2.473855495452881 - critic/advantages/min:-2.474719285964966 - critic/returns/mean:-0.03083660453557968 - critic/returns/max:2.473855495452881 - critic/returns/min:-2.474719285964966 - response_length/mean:1654.3082275390625 - response_length/max:8192.0 - response_length/min:143.0 - response_length/clip_ratio:0.07386363297700882 - response_length_non_aborted/mean:1654.3082275390625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:143.0 - response_length_non_aborted/clip_ratio:0.07386363297700882 - response/aborted_ratio:0.0 - prompt_length/mean:255.65908813476562 - prompt_length/max:477.0 - prompt_length/min:181.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010092929005622864 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1444614324718714) - timing_s/agent_loop/generate_sequences/max:np.float64(40.010992149822414) - timing_s/agent_loop/generate_sequences/mean:np.float64(10.094617543247296) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(40.010992149822414) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:203 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:41.52097429241985 - timing_s/reward:0.0002491772174835205 - timing_s/old_log_prob:11.822890755720437 - timing_s/ref:27.7532027810812 - timing_s/adv:0.06694699730724096 - timing_s/update_actor:24.982376236468554 - timing_s/update_weights:36.73068517073989 - timing_s/step:143.3541603963822 - timing_s/stop_profile:8.83694738149643e-05 - timing_per_token_ms/adv:4.978889699240822e-05 - timing_per_token_ms/update_actor:0.018579548106612182 - timing_per_token_ms/gen:0.03565155228507165 - timing_per_token_ms/ref:0.020640228987943182 - perf/total_num_tokens:2228206 - perf/time_per_step:143.3541603963822 - perf/throughput:3885.8411814468564 - frontier/active_count:50.0 - frontier/completed_count:14.0 - frontier/blacklisted_count:1306.0 - frontier/mean_score:2.7329376029059067 - frontier/mean_frontier_pct:0.534820740661524 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:4.119282876769999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:112.0 - frontier/cluster_1/score:2.4020389347592994 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.689617102769999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:1.6417222929592994 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.3867359265699994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:112.0 - frontier/cluster_9/score:2.276311098959299 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:160.0 - frontier/cluster_10/score:4.89126761771 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.9176456999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:48.0 - frontier/cluster_12/score:2.9176456999999996 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.86787 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:2.7664883929999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:3.0148155385699993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.1307959265699994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:96.0 - frontier/cluster_17/score:2.808795332569999 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:160.0 - frontier/cluster_18/score:3.7716668454636024 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.7563418750999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.9717524750999993 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.5171636690999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:96.0 - frontier/cluster_22/score:1.6380142999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.564812746485039 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:2.06207 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:112.0 - frontier/cluster_26/score:1.8158722690999995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:2.3035968967699993 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:128.0 - frontier/cluster_30/score:1.9620063266386998 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.909637332569999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:128.0 - frontier/cluster_32/score:1.8735625840192995 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:3.5442867325699994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.8319299999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:144.0 - frontier/cluster_35/score:3.8202551908274556 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:96.0 - frontier/cluster_36/score:2.6862267325699993 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:128.0 - frontier/cluster_37/score:2.6143815824386993 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:160.0 - frontier/cluster_38/score:2.2077216441485503 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.8823509999999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:3.051587639417299 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.4370490890999994 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.3959646392999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:160.0 - frontier/cluster_44/score:1.8419184671807274 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:1.9540999999999997 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:3.2696071692715094 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:96.0 - frontier/cluster_47/score:3.196111098959299 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:80.0 - frontier/cluster_48/score:2.7882149929999995 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:112.0 - frontier/cluster_49/score:3.2472312036172992 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.3262142999999993 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:80.0 - frontier/cluster_51/score:2.215586392999999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.8709104750999987 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.060488392999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.225128732569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.7118456999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:80.0 - frontier/cluster_57/score:2.7659524750999993 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.7242867325699995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:336.0 - cluster/prob_snapshot/cluster_0:0.03014545866242979 - cluster/prob_snapshot/cluster_1:0.017578439640958024 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.01968297483199144 
- cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.012014341573065346 - cluster/prob_snapshot/cluster_6:0.010148317510765677 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.016658346656278716 - cluster/prob_snapshot/cluster_10:0.03579494542801973 - cluster/prob_snapshot/cluster_11:0.021351718362670954 - cluster/prob_snapshot/cluster_12:0.021351718362670954 - cluster/prob_snapshot/cluster_13:0.020987453185543795 - cluster/prob_snapshot/cluster_14:0.020245529133621043 - cluster/prob_snapshot/cluster_15:0.02206282013438122 - cluster/prob_snapshot/cluster_16:0.015593447316940894 - cluster/prob_snapshot/cluster_17:0.02055513693092322 - cluster/prob_snapshot/cluster_18:0.027601558421628248 - cluster/prob_snapshot/cluster_19:0.027489408255101884 - cluster/prob_snapshot/cluster_20:0.021747678922051955 - cluster/prob_snapshot/cluster_21:0.018420937722277472 - cluster/prob_snapshot/cluster_22:0.011987205988591284 - cluster/prob_snapshot/cluster_23:0.01876963999293579 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.01509050186734904 - cluster/prob_snapshot/cluster_26:0.013288794205686948 - cluster/prob_snapshot/cluster_27:0.016858027745094557 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.014358222628665337 - cluster/prob_snapshot/cluster_31:0.0212931120672218 - cluster/prob_snapshot/cluster_32:0.013710979584950335 - cluster/prob_snapshot/cluster_33:0.025937560585367135 - cluster/prob_snapshot/cluster_34:0.02072443949681716 - cluster/prob_snapshot/cluster_35:0.02795713437998302 - cluster/prob_snapshot/cluster_36:0.019658163653013955 - cluster/prob_snapshot/cluster_37:0.019132391311523923 - cluster/prob_snapshot/cluster_38:0.01615639992512892 - cluster/prob_snapshot/cluster_39:0.02109342706496645 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.022331923247516334 - 
cluster/prob_snapshot/cluster_42:0.017834648595772606 - cluster/prob_snapshot/cluster_43:0.017533987140814285 - cluster/prob_snapshot/cluster_44:0.013479403739201604 - cluster/prob_snapshot/cluster_45:0.01430036308126628 - cluster/prob_snapshot/cluster_46:0.023927419095078988 - cluster/prob_snapshot/cluster_47:0.023389565100651433 - cluster/prob_snapshot/cluster_48:0.02040452727523173 - cluster/prob_snapshot/cluster_49:0.023763668809449207 - cluster/prob_snapshot/cluster_50:0.017023544902939294 - cluster/prob_snapshot/cluster_51:0.01621395520076409 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.021009703785753373 - cluster/prob_snapshot/cluster_54:0.02239706014323789 - cluster/prob_snapshot/cluster_55:0.02360191999364164 - cluster/prob_snapshot/cluster_56:0.01984564665593916 - cluster/prob_snapshot/cluster_57:0.02024160721532016 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.027254824468806026 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(RewardLoopWorker pid=2826754)[0m WARNING:2026-04-12 23:04:04,433:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  42%|████▏     | 337/800 [11:34:02<18:41:34, 145.35s/it]
[36m(RewardLoopWorker pid=2826756)[0m WARNING:2026-04-12 23:04:06,384:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m step:337 - global_seqlen/min:440090 - global_seqlen/max:734100 - global_seqlen/minmax_diff:294010 - global_seqlen/balanced_min:581112 - global_seqlen/balanced_max:581216 - global_seqlen/mean:581163.75 - frontier/skipped_zero_acc_count:40.0 - actor/entropy:np.float64(0.17007223882881756) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.002744679804891348 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.025630121410358697) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0032122347949387568) - actor/ppo_kl:np.float64(-0.0025873099783304747) - actor/pg_clipfrac_lower:np.float64(0.000555702605693527) - actor/grad_norm:np.float64(0.403814207423817) - perf/mfu/actor:np.float64(0.32505647897656703) - perf/max_memory_allocated_gb:np.float64(112.75414419174194) - perf/max_memory_reserved_gb:np.float64(119.345703125) - perf/cpu_memory_used_gb:np.float64(118.74651336669922) - actor/lr:np.float64(1e-06) - training/global_step:337 - training/epoch:0 - critic/score/mean:0.5724431872367859 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6068536639213562 - critic/rewards/max:1.6526684761047363 - critic/rewards/min:-0.10789791494607925 - critic/advantages/mean:-0.09838192909955978 - critic/advantages/max:2.4733121395111084 - critic/advantages/min:-2.474835157394409 - critic/returns/mean:-0.09838192909955978 - critic/returns/max:2.4733121395111084 - critic/returns/min:-2.474835157394409 - response_length/mean:1613.508544921875 - response_length/max:8192.0 - response_length/min:198.0 - response_length/clip_ratio:0.07386363297700882 - response_length_non_aborted/mean:1613.508544921875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:198.0 - response_length_non_aborted/clip_ratio:0.07386363297700882 - response/aborted_ratio:0.0 - prompt_length/mean:242.19317626953125 - prompt_length/max:527.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.500926196575165e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5900062508881092) - timing_s/agent_loop/generate_sequences/max:np.float64(44.536634172312915) - timing_s/agent_loop/generate_sequences/mean:np.float64(11.130660177192112) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(44.536634172312915) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:207 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:46.05947459395975 - timing_s/reward:0.00019539520144462585 - timing_s/old_log_prob:10.775825524702668 - timing_s/ref:28.130417469888926 - timing_s/adv:0.08617745712399483 - timing_s/update_actor:22.104511454701424 - timing_s/update_weights:33.677741792052984 - timing_s/step:141.2863900726661 - timing_s/stop_profile:6.144121289253235e-05 - timing_per_token_ms/adv:6.59648910100434e-05 - timing_per_token_ms/update_actor:0.016919989723549675 - timing_per_token_ms/gen:0.04054852461371037 - timing_per_token_ms/ref:0.021532544407736696 - perf/total_num_tokens:2324655 - perf/time_per_step:141.2863900726661 - perf/throughput:4113.373904599708 - frontier/active_count:50.0 - frontier/completed_count:14.0 - frontier/blacklisted_count:1346.0 - frontier/mean_score:2.693442666692795 - frontier/mean_frontier_pct:0.553049257796356 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:4.119282876769999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:2.5814272543315093 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:96.0 - frontier/cluster_3/score:2.689617102769999 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:1.6417222929592994 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.3867359265699994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.8934177692715093 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:160.0 - frontier/cluster_10/score:4.89126761771 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.9176456999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.9423519899999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.86787 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:2.7664883929999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:3.0148155385699993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.1307959265699994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:112.0 - frontier/cluster_17/score:2.8661567327989994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:160.0 - frontier/cluster_18/score:3.7716668454636024 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.7563418750999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.9717524750999993 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:80.0 - frontier/cluster_21/score:2.5171636690999994 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:96.0 - frontier/cluster_22/score:1.6380142999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:128.0 - frontier/cluster_23/score:2.564812746485039 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:2.06207 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:112.0 - frontier/cluster_26/score:2.1711105883699995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:96.0 - frontier/cluster_27/score:2.5125178277389995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:128.0 - frontier/cluster_30/score:1.9620063266386998 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.909637332569999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:128.0 - frontier/cluster_32/score:1.8735625840192995 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:3.5442867325699994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.8319299999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:2.9741786335792186 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:96.0 - frontier/cluster_36/score:2.780358712798999 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:2.1300671077070894 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:160.0 - frontier/cluster_38/score:2.2077216441485503 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:96.0 - frontier/cluster_41/score:3.051587639417299 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.4370490890999994 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.3959646392999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:176.0 - frontier/cluster_44/score:1.5893429270265091 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:1.9540999999999997 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:112.0 - frontier/cluster_46/score:3.2696071692715094 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:2.537277769271509 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.2517504950999996 - frontier/cluster_48/cluster_size:302.0 - 
frontier/cluster_49/frontier:128.0 - frontier/cluster_49/score:3.173061842532109 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:64.0 - frontier/cluster_50/score:2.3262142999999993 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:80.0 - frontier/cluster_51/score:2.450910475099999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.8709104750999987 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:64.0 - frontier/cluster_54/score:3.060488392999999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.225128732569999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.7118456999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:96.0 - frontier/cluster_57/score:2.8361667325699993 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.7242867325699995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:337.0 - cluster/prob_snapshot/cluster_0:0.03058749256265369 - cluster/prob_snapshot/cluster_1:0.019168236148135086 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.019971593500243367 
- cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01219051226343812 - cluster/prob_snapshot/cluster_6:0.010297126006938436 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.014059462209354437 - cluster/prob_snapshot/cluster_10:0.036319819821640044 - cluster/prob_snapshot/cluster_11:0.02166480642843976 - cluster/prob_snapshot/cluster_12:0.02184826153075561 - cluster/prob_snapshot/cluster_13:0.02129519989761935 - cluster/prob_snapshot/cluster_14:0.020542396741616155 - cluster/prob_snapshot/cluster_15:0.022386335345847993 - cluster/prob_snapshot/cluster_16:0.015822099745574653 - cluster/prob_snapshot/cluster_17:0.021282478132852 - cluster/prob_snapshot/cluster_18:0.02800629018099524 - cluster/prob_snapshot/cluster_19:0.027892495515505513 - cluster/prob_snapshot/cluster_20:0.022066573102511467 - cluster/prob_snapshot/cluster_21:0.018691050678206984 - cluster/prob_snapshot/cluster_22:0.012162978779951334 - cluster/prob_snapshot/cluster_23:0.01904486609796156 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.015311779422666974 - cluster/prob_snapshot/cluster_26:0.016121453894066714 - cluster/prob_snapshot/cluster_27:0.018656553256610076 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01456876250533147 - cluster/prob_snapshot/cluster_31:0.02160534076741766 - cluster/prob_snapshot/cluster_32:0.013912028699833408 - cluster/prob_snapshot/cluster_33:0.026317892535072468 - cluster/prob_snapshot/cluster_34:0.02102832954285416 - cluster/prob_snapshot/cluster_35:0.02208458839950829 - cluster/prob_snapshot/cluster_36:0.02064538998495131 - cluster/prob_snapshot/cluster_37:0.01581668794400247 - cluster/prob_snapshot/cluster_38:0.0163933071340208 - cluster/prob_snapshot/cluster_39:0.02166480642843976 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.022659384416481833 - 
cluster/prob_snapshot/cluster_42:0.01809616457952963 - cluster/prob_snapshot/cluster_43:0.017791094415549148 - cluster/prob_snapshot/cluster_44:0.011801572364471527 - cluster/prob_snapshot/cluster_45:0.0145100545421996 - cluster/prob_snapshot/cluster_46:0.02427827560395909 - cluster/prob_snapshot/cluster_47:0.018840406745222936 - cluster/prob_snapshot/cluster_48:0.01672024077545978 - cluster/prob_snapshot/cluster_49:0.02356138396239357 - cluster/prob_snapshot/cluster_50:0.017273167376206263 - cluster/prob_snapshot/cluster_51:0.01819909148546611 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.021317776766528407 - cluster/prob_snapshot/cluster_54:0.022725476438359756 - cluster/prob_snapshot/cluster_55:0.023948003589993227 - cluster/prob_snapshot/cluster_56:0.020136650640719234 - cluster/prob_snapshot/cluster_57:0.021059789151201436 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.027654471941242025 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  42%|████▏     | 338/800 [11:36:16<18:13:30, 142.01s/it]
[36m(TaskRunner pid=2823680)[0m step:338 - global_seqlen/min:424661 - global_seqlen/max:563450 - global_seqlen/minmax_diff:138789 - global_seqlen/balanced_min:495093 - global_seqlen/balanced_max:495154 - global_seqlen/mean:495112.5 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.1766019093100818) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0107241440564394 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.09056468850758392) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0009836223347223013) - actor/ppo_kl:np.float64(0.0014899636766158993) - actor/pg_clipfrac_lower:np.float64(1.7317068155460554e-05) - actor/grad_norm:np.float64(0.4256969541311264) - perf/mfu/actor:np.float64(0.23969374836393287) - perf/max_memory_allocated_gb:np.float64(112.75414419174194) - perf/max_memory_reserved_gb:np.float64(119.345703125) - perf/cpu_memory_used_gb:np.float64(119.14186096191406) - actor/lr:np.float64(1e-06) - training/global_step:338 - training/epoch:0 - critic/score/mean:0.5676020383834839 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5901669859886169 - critic/rewards/max:1.586715579032898 - critic/rewards/min:-0.08460218459367752 - critic/advantages/mean:-0.11901834607124329 - critic/advantages/max:2.474674701690674 - critic/advantages/min:-2.474785327911377 - critic/returns/mean:-0.11901834607124329 - critic/returns/max:2.474674701690674 - critic/returns/min:-2.474785327911377 - response_length/mean:1553.0791015625 - response_length/max:8192.0 - response_length/min:166.0 - response_length/clip_ratio:0.05994898080825806 - response_length_non_aborted/mean:1553.0791015625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:166.0 - response_length_non_aborted/clip_ratio:0.05994898080825806 - response/aborted_ratio:0.0 - prompt_length/mean:238.47959899902344 - prompt_length/max:434.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.0001295013353228569 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3554622754454613) - timing_s/agent_loop/generate_sequences/max:np.float64(36.98706013336778) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.77198947931447) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(36.98706013336778) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:173 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:38.65677858144045 - timing_s/reward:0.00022243615239858627 - timing_s/old_log_prob:11.28538988251239 - timing_s/ref:26.651123794727027 - timing_s/adv:0.08795825392007828 - timing_s/update_actor:25.058435606770217 - timing_s/update_weights:31.789281864650548 - timing_s/step:133.97844099905342 - timing_s/stop_profile:5.5516138672828674e-05 - timing_per_token_ms/adv:6.262237015715585e-05 - timing_per_token_ms/update_actor:0.017840493190693186 - timing_per_token_ms/gen:0.03174797479450832 - timing_per_token_ms/ref:0.018974416441850335 - perf/total_num_tokens:1980450 - perf/time_per_step:133.97844099905342 - perf/throughput:3695.463959037246 - frontier/active_count:49.0 - frontier/completed_count:15.0 - frontier/blacklisted_count:1376.0 - frontier/mean_score:2.6941388338615058 - frontier/mean_frontier_pct:0.563716445065856 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:14.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:112.0 - frontier/cluster_0/score:3.783498013738999 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:2.5814272543315093 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:112.0 - frontier/cluster_3/score:2.7827319719389987 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.0492056050715095 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.3867359265699994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.8934177692715093 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:160.0 - frontier/cluster_10/score:4.89126761771 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.9176456999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.9423519899999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.86787 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:2.7664883929999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:3.0148155385699993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.1307959265699994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:112.0 - frontier/cluster_17/score:2.9063097129592994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:160.0 - frontier/cluster_18/score:3.7716668454636024 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:80.0 - frontier/cluster_19/score:3.7563418750999995 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.9717524750999993 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.662014568369999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:96.0 - frontier/cluster_22/score:1.6380142999999998 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:144.0 - frontier/cluster_23/score:2.6953689225395268 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:2.06207 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:112.0 - frontier/cluster_26/score:2.1711105883699995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:112.0 - frontier/cluster_27/score:2.6587624794172995 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:144.0 - frontier/cluster_30/score:1.6734044286470897 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.909637332569999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:128.0 - frontier/cluster_32/score:1.8735625840192995 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:3.5442867325699994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.8319299999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:2.9741786335792186 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:2.1300671077070894 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:160.0 - frontier/cluster_38/score:2.2077216441485503 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:3.0361113475921093 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.4370490890999994 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.3959646392999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:176.0 - frontier/cluster_44/score:1.5893429270265091 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:48.0 - frontier/cluster_45/score:2.2678699999999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:128.0 - frontier/cluster_46/score:3.1887250184900564 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:2.537277769271509 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.2517504950999996 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:128.0 - 
frontier/cluster_49/score:3.173061842532109 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:80.0 - frontier/cluster_50/score:2.5283500099999996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:80.0 - frontier/cluster_51/score:2.450910475099999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.8709104750999987 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:80.0 - frontier/cluster_54/score:3.042341875099999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.157590112798999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:48.0 - frontier/cluster_56/score:2.7118456999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:112.0 - frontier/cluster_57/score:2.2853167127989993 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.7242867325699995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:338.0 - cluster/prob_snapshot/cluster_0:0.028660083959992453 - cluster/prob_snapshot/cluster_1:0.01955437047332822 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02107925831183713 - cluster/prob_snapshot/cluster_4:0.0 
- cluster/prob_snapshot/cluster_5:0.015522779311464984 - cluster/prob_snapshot/cluster_6:0.01050455635010565 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.014342682893346044 - cluster/prob_snapshot/cluster_10:0.03705146403812314 - cluster/prob_snapshot/cluster_11:0.022101232886567436 - cluster/prob_snapshot/cluster_12:0.022288383598202188 - cluster/prob_snapshot/cluster_13:0.02172418082099556 - cluster/prob_snapshot/cluster_14:0.020956212829980933 - cluster/prob_snapshot/cluster_15:0.022837296635426912 - cluster/prob_snapshot/cluster_16:0.01614082786229761 - cluster/prob_snapshot/cluster_17:0.022015362525548064 - cluster/prob_snapshot/cluster_18:0.028570462589798427 - cluster/prob_snapshot/cluster_19:0.028454375588903926 - cluster/prob_snapshot/cluster_20:0.02251109294504754 - cluster/prob_snapshot/cluster_21:0.020164821219718576 - cluster/prob_snapshot/cluster_22:0.012407995774068023 - cluster/prob_snapshot/cluster_23:0.020417481215166886 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.015620227397179897 - cluster/prob_snapshot/cluster_26:0.01644621234718726 - cluster/prob_snapshot/cluster_27:0.020140186571546096 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.012676076807729827 - cluster/prob_snapshot/cluster_31:0.022040569320181825 - cluster/prob_snapshot/cluster_32:0.01419227941109149 - cluster/prob_snapshot/cluster_33:0.026848053035809224 - cluster/prob_snapshot/cluster_34:0.021451934499263196 - cluster/prob_snapshot/cluster_35:0.022529471150999326 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.016135307042747345 - cluster/prob_snapshot/cluster_38:0.016723541931785164 - cluster/prob_snapshot/cluster_39:0.022101232886567436 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02299861287567689 - cluster/prob_snapshot/cluster_42:0.018460702570636364 - 
cluster/prob_snapshot/cluster_43:0.018149486924046273 - cluster/prob_snapshot/cluster_44:0.01203930901097129 - cluster/prob_snapshot/cluster_45:0.017179167102592232 - cluster/prob_snapshot/cluster_46:0.024154664922088655 - cluster/prob_snapshot/cluster_47:0.01921993711456465 - cluster/prob_snapshot/cluster_48:0.017057061484418285 - cluster/prob_snapshot/cluster_49:0.024036016006084262 - cluster/prob_snapshot/cluster_50:0.019152309133958625 - cluster/prob_snapshot/cluster_51:0.018565702886513165 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.021747212489395488 - cluster/prob_snapshot/cluster_54:0.023045809264003972 - cluster/prob_snapshot/cluster_55:0.023918817299610238 - cluster/prob_snapshot/cluster_56:0.020542293181155098 - cluster/prob_snapshot/cluster_57:0.017311326350946395 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.02821155715133022 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826757) WARNING:2026-04-12 23:08:42,006:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  42%|████▏     | 339/800 [11:38:13<17:14:06, 134.59s/it]
[36m(TaskRunner pid=2823680)[0m step:339 - global_seqlen/min:369014 - global_seqlen/max:575325 - global_seqlen/minmax_diff:206311 - global_seqlen/balanced_min:460762 - global_seqlen/balanced_max:460849 - global_seqlen/mean:460806.75 - frontier/skipped_zero_acc_count:25.0 - actor/entropy:np.float64(0.1718760329262855) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009563709609210491 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.10884074398563826) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.001463977107680246) - actor/ppo_kl:np.float64(0.001169929755060541) - actor/pg_clipfrac_lower:np.float64(0.00013453336088367415) - actor/grad_norm:np.float64(0.49064830862558806) - perf/mfu/actor:np.float64(0.2191637035602347) - perf/max_memory_allocated_gb:np.float64(112.75414419174194) - perf/max_memory_reserved_gb:np.float64(119.345703125) - perf/cpu_memory_used_gb:np.float64(118.61836624145508) - actor/lr:np.float64(1e-06) - training/global_step:339 - training/epoch:0 - critic/score/mean:0.6808252334594727 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.7015175223350525 - critic/rewards/max:1.3418807983398438 - critic/rewards/min:-0.2785368263721466 - critic/advantages/mean:-0.09267175197601318 - critic/advantages/max:2.4745707511901855 - critic/advantages/min:-2.4748423099517822 - critic/returns/mean:-0.09267175197601318 - critic/returns/max:2.4745707511901855 - critic/returns/min:-2.4748423099517822 - response_length/mean:1366.548583984375 - response_length/max:8192.0 - response_length/min:133.0 - response_length/clip_ratio:0.05339805781841278 - response_length_non_aborted/mean:1366.548583984375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:133.0 - response_length_non_aborted/clip_ratio:0.05339805781841278 - response/aborted_ratio:0.0 - prompt_length/mean:241.10679626464844 - prompt_length/max:478.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010073278099298477 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3565913550555706) - timing_s/agent_loop/generate_sequences/max:np.float64(35.785589046776295) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.207588357628993) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(35.785589046776295) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:188 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:38.49444126710296 - timing_s/reward:0.00016516167670488358 - timing_s/old_log_prob:10.71736648492515 - timing_s/ref:15.294008227996528 - timing_s/adv:0.09933398012071848 - timing_s/update_actor:25.508727244101465 - timing_s/update_weights:26.47623754106462 - timing_s/step:117.01885748375207 - timing_s/stop_profile:5.716551095247269e-05 - timing_per_token_ms/adv:7.498556672166129e-05 - timing_per_token_ms/update_actor:0.01925611322955811 - timing_per_token_ms/gen:0.03418579980311727 - timing_per_token_ms/ref:0.011545192018162892 - perf/total_num_tokens:1843227 - perf/time_per_step:117.01885748375207 - perf/throughput:3937.884541933615 - frontier/active_count:48.0 - frontier/completed_count:16.0 - frontier/blacklisted_count:1401.0 - frontier/mean_score:2.7054608538500884 - frontier/mean_frontier_pct:0.5723794851905083 - frontier/batch_easy_count:5.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:3.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:2.5814272543315093 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:3.447912380357299 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.0492056050715095 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.3867359265699994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.8934177692715093 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:176.0 - frontier/cluster_10/score:4.923887332396999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.9176456999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.9423519899999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.86787 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:2.7664883929999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:3.0148155385699993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.1307959265699994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:112.0 - frontier/cluster_17/score:2.9063097129592994 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:176.0 - frontier/cluster_18/score:4.140166791824521 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.529439312569999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.9717524750999993 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.662014568369999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:112.0 - frontier/cluster_22/score:1.44661001 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:144.0 - frontier/cluster_23/score:2.6953689225395268 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:2.06207 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:112.0 - frontier/cluster_26/score:2.1711105883699995 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:2.1611337355921094 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:144.0 - frontier/cluster_30/score:1.6734044286470897 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:96.0 - frontier/cluster_31/score:2.936746132798999 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:128.0 - frontier/cluster_32/score:1.8735625840192995 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - 
frontier/cluster_33/score:3.5442867325699994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.8319299999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:2.981925043505453 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:2.1300671077070894 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:160.0 - frontier/cluster_38/score:2.4454051509039854 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:112.0 - frontier/cluster_41/score:3.0252779433144763 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.4370490890999994 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.3959646392999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:176.0 - frontier/cluster_44/score:2.0125400489185563 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.4875089999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:128.0 - frontier/cluster_46/score:3.132107512943039 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:2.537277769271509 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.2517504950999996 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:144.0 - 
frontier/cluster_49/score:3.7211432897724763 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:80.0 - frontier/cluster_50/score:2.5283500099999996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:80.0 - frontier/cluster_51/score:2.450910475099999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:80.0 - frontier/cluster_53/score:2.8709104750999987 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:80.0 - frontier/cluster_54/score:3.042341875099999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:96.0 - frontier/cluster_55/score:3.157590112798999 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.798291989999999 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:112.0 - frontier/cluster_57/score:2.2853167127989993 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.7242867325699995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:339.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.019878215716448783 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.026550562659921093 - cluster/prob_snapshot/cluster_4:0.0 - 
cluster/prob_snapshot/cluster_5:0.01577985627780788 - cluster/prob_snapshot/cluster_6:0.010678525162331729 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.014580215962228636 - cluster/prob_snapshot/cluster_10:0.037916270695849526 - cluster/prob_snapshot/cluster_11:0.02246725741019898 - cluster/prob_snapshot/cluster_12:0.022657507575625516 - cluster/prob_snapshot/cluster_13:0.0220839608829089 - cluster/prob_snapshot/cluster_14:0.021303274365307177 - cluster/prob_snapshot/cluster_15:0.023215511310821547 - cluster/prob_snapshot/cluster_16:0.01640814049863958 - cluster/prob_snapshot/cluster_17:0.02237996492679632 - cluster/prob_snapshot/cluster_18:0.03188125036329093 - cluster/prob_snapshot/cluster_19:0.027178358753150157 - cluster/prob_snapshot/cluster_20:0.022883905272483095 - cluster/prob_snapshot/cluster_21:0.02049877630353422 - cluster/prob_snapshot/cluster_22:0.011139584037513712 - cluster/prob_snapshot/cluster_23:0.02075562066982507 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.015878918227750893 - cluster/prob_snapshot/cluster_26:0.01671858253896878 - cluster/prob_snapshot/cluster_27:0.01664175603481013 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.012886008760344384 - cluster/prob_snapshot/cluster_31:0.02261433981309023 - cluster/prob_snapshot/cluster_32:0.014427321607033767 - cluster/prob_snapshot/cluster_33:0.027292690937835002 - cluster/prob_snapshot/cluster_34:0.021807205815862013 - cluster/prob_snapshot/cluster_35:0.022962238879914654 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.016402528247296516 - cluster/prob_snapshot/cluster_38:0.01883078092641143 - cluster/prob_snapshot/cluster_39:0.02246725741019898 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.02329607679569994 - cluster/prob_snapshot/cluster_42:0.018766435282426733 - 
cluster/prob_snapshot/cluster_43:0.01845006550894361 - cluster/prob_snapshot/cluster_44:0.015497514083833982 - cluster/prob_snapshot/cluster_45:0.019155000558562216 - cluster/prob_snapshot/cluster_46:0.024118715212648814 - cluster/prob_snapshot/cluster_47:0.019538243715951687 - cluster/prob_snapshot/cluster_48:0.017339548114753855 - cluster/prob_snapshot/cluster_49:0.028654570413245477 - cluster/prob_snapshot/cluster_50:0.019469495729981596 - cluster/prob_snapshot/cluster_51:0.018873174537067595 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.022107373985027833 - cluster/prob_snapshot/cluster_54:0.02342747717370177 - cluster/prob_snapshot/cluster_55:0.024314943332617234 - cluster/prob_snapshot/cluster_56:0.02154817716497515 - cluster/prob_snapshot/cluster_57:0.017598023930830566 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.028678776415532247 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  42%|████▎     | 340/800 [11:40:14<16:41:07, 130.58s/it]
[36m(TaskRunner pid=2823680)[0m step:340 - global_seqlen/min:446952 - global_seqlen/max:565706 - global_seqlen/minmax_diff:118754 - global_seqlen/balanced_min:503872 - global_seqlen/balanced_max:504135 - global_seqlen/mean:504004.5 - frontier/skipped_zero_acc_count:24.0 - actor/entropy:np.float64(0.17574685152906638) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0066501665860414505 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.011352869274560362) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0015070459397135822) - actor/ppo_kl:np.float64(-0.0012692069117074074) - actor/pg_clipfrac_lower:np.float64(0.00017756775689401996) - actor/grad_norm:np.float64(0.4235221537259909) - perf/mfu/actor:np.float64(0.2213060031237688) - perf/max_memory_allocated_gb:np.float64(112.75414419174194) - perf/max_memory_reserved_gb:np.float64(119.345703125) - perf/cpu_memory_used_gb:np.float64(118.87116241455078) - actor/lr:np.float64(1e-06) - training/global_step:340 - training/epoch:0 - critic/score/mean:0.6225961446762085 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6512881517410278 - critic/rewards/max:1.7636749744415283 - critic/rewards/min:-0.05855169892311096 - critic/advantages/mean:-0.0955498218536377 - critic/advantages/max:2.474721908569336 - critic/advantages/min:-2.474858045578003 - critic/returns/mean:-0.0955498218536377 - critic/returns/max:2.474721908569336 - critic/returns/min:-2.474858045578003 - response_length/mean:1514.41943359375 - response_length/max:8192.0 - response_length/min:103.0 - response_length/clip_ratio:0.06610576808452606 - response_length_non_aborted/mean:1514.41943359375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:103.0 - response_length_non_aborted/clip_ratio:0.06610576808452606 - response/aborted_ratio:0.0 - prompt_length/mean:246.25 - prompt_length/max:527.0 - prompt_length/min:181.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.842069655656815e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.8854983560740948) - timing_s/agent_loop/generate_sequences/max:np.float64(38.3892383472994) - timing_s/agent_loop/generate_sequences/mean:np.float64(9.025119013750555) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(38.3892383472994) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:197 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:40.30907657369971 - timing_s/reward:0.00037079770117998123 - timing_s/old_log_prob:12.531480667181313 - timing_s/ref:15.3428859366104 - timing_s/adv:0.09911922831088305 - timing_s/update_actor:27.659710029140115 - timing_s/update_weights:23.792850797995925 - timing_s/step:120.1777337808162 - timing_s/stop_profile:6.640609353780746e-05 - timing_per_token_ms/adv:6.766385731422027e-05 - timing_per_token_ms/update_actor:0.018881933451846206 - timing_per_token_ms/gen:0.03199140678406354 - timing_per_token_ms/ref:0.01047383905721122 - perf/total_num_tokens:2016018 - perf/time_per_step:120.1777337808162 - perf/throughput:4193.825962130545 - frontier/active_count:45.0 - frontier/completed_count:19.0 - frontier/blacklisted_count:1424.0 - frontier/mean_score:2.7796393972341082 - frontier/mean_frontier_pct:0.5690966531401822 - frontier/batch_easy_count:4.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:2.706999078032056 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:3.447912380357299 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.0492056050715095 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.3867359265699994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.8934177692715093 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:176.0 - frontier/cluster_10/score:4.923887332396999 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.9176456999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.9423519899999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.86787 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:2.7664883929999995 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:3.0148155385699993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.1307959265699994 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.9344167990715095 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:192.0 - frontier/cluster_18/score:4.398116754277165 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.529439312569999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.9717524750999993 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.662014568369999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:144.0 - frontier/cluster_23/score:2.7867582457776683 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:2.06207 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:128.0 - frontier/cluster_26/score:2.419777411858999 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:2.4127936149144764 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:144.0 - frontier/cluster_30/score:1.6734044286470897 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:3.5557222929592993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:144.0 - frontier/cluster_32/score:1.6114938088135096 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - frontier/cluster_33/score:3.5442867325699994 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.8319299999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:2.981925043505453 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:2.1300671077070894 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:128.0 - frontier/cluster_41/score:3.017694560320133 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.4370490890999994 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.3959646392999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:176.0 - frontier/cluster_44/score:2.0125400489185563 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.6412562999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:128.0 - frontier/cluster_46/score:3.132107512943039 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:112.0 - frontier/cluster_47/score:2.6760944384900562 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.2517504950999996 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:144.0 - frontier/cluster_49/score:3.7211432897724763 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:80.0 - frontier/cluster_50/score:2.5283500099999996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:80.0 - frontier/cluster_51/score:2.450910475099999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:3.509637332569999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:80.0 - frontier/cluster_54/score:3.029639312569999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.798291989999999 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:112.0 - frontier/cluster_57/score:2.2853167127989993 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.7242867325699995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:340.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.02164148886623101 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02756482556524865 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.016382665456618265 - cluster/prob_snapshot/cluster_6:0.011086457457194523 - 
cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.015137197461700828 - cluster/prob_snapshot/cluster_10:0.039364717094810805 - cluster/prob_snapshot/cluster_11:0.023325533224067482 - cluster/prob_snapshot/cluster_12:0.02352305117089648 - cluster/prob_snapshot/cluster_13:0.02292759431596044 - cluster/prob_snapshot/cluster_14:0.022117084649763526 - cluster/prob_snapshot/cluster_15:0.02410237130894592 - cluster/prob_snapshot/cluster_16:0.01703495087800283 - cluster/prob_snapshot/cluster_17:0.023459612159215996 - cluster/prob_snapshot/cluster_18:0.035161369481983326 - cluster/prob_snapshot/cluster_19:0.02821660421197866 - cluster/prob_snapshot/cluster_20:0.023758097527622982 - cluster/prob_snapshot/cluster_21:0.021281853810236825 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02227913486868232 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.016485511690248353 - cluster/prob_snapshot/cluster_26:0.019345254434136785 - cluster/prob_snapshot/cluster_27:0.019289421476879634 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013378269540304143 - cluster/prob_snapshot/cluster_31:0.02842672723421475 - cluster/prob_snapshot/cluster_32:0.012883316290891171 - cluster/prob_snapshot/cluster_33:0.02833530402138379 - cluster/prob_snapshot/cluster_34:0.02264026687792607 - cluster/prob_snapshot/cluster_35:0.023839423571534176 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.01702912423201866 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.023325533224067482 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.024125388057513287 - cluster/prob_snapshot/cluster_42:0.01948333531260682 - cluster/prob_snapshot/cluster_43:0.01915487983948257 - cluster/prob_snapshot/cluster_44:0.016089537456798194 - 
cluster/prob_snapshot/cluster_45:0.021115898883448236 - cluster/prob_snapshot/cluster_46:0.025040078668394947 - cluster/prob_snapshot/cluster_47:0.0213944173330374 - cluster/prob_snapshot/cluster_48:0.018001939366992178 - cluster/prob_snapshot/cluster_49:0.02974920890398162 - cluster/prob_snapshot/cluster_50:0.020213253501042416 - cluster/prob_snapshot/cluster_51:0.01959415213305716 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.028058294468478167 - cluster/prob_snapshot/cluster_54:0.024220882077043315 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.022371343026120993 - cluster/prob_snapshot/cluster_57:0.018270289265043302 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.029774339604193636 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  43%|████▎     | 341/800 [11:42:19<16:25:07, 128.78s/it]
[36m(TaskRunner pid=2823680)[0m step:341 - global_seqlen/min:434051 - global_seqlen/max:595626 - global_seqlen/minmax_diff:161575 - global_seqlen/balanced_min:492991 - global_seqlen/balanced_max:493151 - global_seqlen/mean:493076.25 - frontier/skipped_zero_acc_count:29.0 - actor/entropy:np.float64(0.1647903119213879) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.016650229692459106 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.10170029255095869) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0011648942986357724) - actor/ppo_kl:np.float64(0.0005713587536627074) - actor/pg_clipfrac_lower:np.float64(5.733540736400755e-05) - actor/grad_norm:np.float64(0.4416458388933769) - perf/mfu/actor:np.float64(0.25722393582840053) - perf/max_memory_allocated_gb:np.float64(112.75414419174194) - perf/max_memory_reserved_gb:np.float64(119.345703125) - perf/cpu_memory_used_gb:np.float64(118.85884857177734) - actor/lr:np.float64(1e-06) - training/global_step:341 - training/epoch:0 - critic/score/mean:0.7070707082748413 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.7255848050117493 - critic/rewards/max:1.4599138498306274 - critic/rewards/min:-0.08028358966112137 - critic/advantages/mean:-0.13293200731277466 - critic/advantages/max:2.474720001220703 - critic/advantages/min:-2.4748003482818604 - critic/returns/mean:-0.13293200731277466 - critic/returns/max:2.474720001220703 - critic/returns/min:-2.4748003482818604 - response_length/mean:1300.035400390625 - response_length/max:8192.0 - response_length/min:73.0 - response_length/clip_ratio:0.049242425709962845 - response_length_non_aborted/mean:1300.035400390625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:73.0 - response_length_non_aborted/clip_ratio:0.049242425709962845 - response/aborted_ratio:0.0 - prompt_length/mean:233.1212158203125 - prompt_length/max:386.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.833167910575867e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.7392237447202206) - timing_s/agent_loop/generate_sequences/max:np.float64(38.548243443481624) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.711800547400344) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(38.548243443481624) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:184 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:40.84875242598355 - timing_s/reward:0.00016535818576812744 - timing_s/old_log_prob:10.63873546384275 - timing_s/ref:20.583702424541116 - timing_s/adv:0.13773384131491184 - timing_s/update_actor:23.546721629798412 - timing_s/update_weights:27.563534011133015 - timing_s/step:124.28579518944025 - timing_s/stop_profile:6.23892992734909e-05 - timing_per_token_ms/adv:0.0001134302713709682 - timing_per_token_ms/update_actor:0.019391828463260267 - timing_per_token_ms/gen:0.039673311551340434 - timing_per_token_ms/ref:0.0169516433255984 - perf/total_num_tokens:1972305 - perf/time_per_step:124.28579518944025 - perf/throughput:3967.277589916353 - frontier/active_count:45.0 - frontier/completed_count:19.0 - frontier/blacklisted_count:1453.0 - frontier/mean_score:2.798325786632453 - frontier/mean_frontier_pct:0.5831698091241523 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:128.0 - frontier/cluster_1/score:2.706999078032056 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:128.0 - frontier/cluster_3/score:3.313538666250109 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.0492056050715095 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.3867359265699994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.8934177692715093 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:192.0 - frontier/cluster_10/score:4.946721132677899 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.9176456999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.9423519899999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.86787 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:2.836541875099999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:3.0148155385699993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.1307959265699994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.9344167990715095 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:192.0 - frontier/cluster_18/score:4.398116754277165 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.370607518798999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.9717524750999993 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.662014568369999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:160.0 - frontier/cluster_23/score:2.8507307720443675 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:48.0 - frontier/cluster_25/score:2.06207 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:128.0 - frontier/cluster_26/score:2.5938441883012993 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:2.4127936149144764 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.4713831000529627 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:3.5557222929592993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:144.0 - frontier/cluster_32/score:1.6114938088135096 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - 
frontier/cluster_33/score:3.5442867325699994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:32.0 - frontier/cluster_34/score:2.8319299999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:2.981925043505453 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:2.1300671077070894 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:144.0 - frontier/cluster_41/score:3.612386192224093 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:96.0 - frontier/cluster_42/score:2.4370490890999994 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.3959646392999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:192.0 - frontier/cluster_44/score:2.308778034242989 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:64.0 - frontier/cluster_45/score:2.6412562999999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:128.0 - frontier/cluster_46/score:3.132107512943039 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.3732661069430394 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.4762253465699997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:144.0 - frontier/cluster_49/score:3.5048003028407333 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:80.0 - frontier/cluster_50/score:2.5283500099999996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:2.615637332569999 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:96.0 - frontier/cluster_53/score:3.356746132798999 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:80.0 - frontier/cluster_54/score:3.029639312569999 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.798291989999999 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:1.8997216989592995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:96.0 - frontier/cluster_59/score:3.5070007127989995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:341.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.021496973424159842 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.02631365973722032 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.016273266877093472 - 
cluster/prob_snapshot/cluster_6:0.01101242538341564 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.015036115747941767 - cluster/prob_snapshot/cluster_10:0.03928318025258204 - cluster/prob_snapshot/cluster_11:0.023169772233395454 - cluster/prob_snapshot/cluster_12:0.023365971213974972 - cluster/prob_snapshot/cluster_13:0.022774490643256592 - cluster/prob_snapshot/cluster_14:0.02252570597470263 - cluster/prob_snapshot/cluster_15:0.023941422824014697 - cluster/prob_snapshot/cluster_16:0.016921196529953487 - cluster/prob_snapshot/cluster_17:0.02330295582919346 - cluster/prob_snapshot/cluster_18:0.034926572288226224 - cluster/prob_snapshot/cluster_19:0.026766858120827693 - cluster/prob_snapshot/cluster_20:0.023599448000864595 - cluster/prob_snapshot/cluster_21:0.02113973990437338 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02263838364164627 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.016375426334087707 - cluster/prob_snapshot/cluster_26:0.02059838144564901 - cluster/prob_snapshot/cluster_27:0.019160612443025308 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.011684630281289625 - cluster/prob_snapshot/cluster_31:0.028236901983360624 - cluster/prob_snapshot/cluster_32:0.012797285326911332 - cluster/prob_snapshot/cluster_33:0.02814608926762159 - cluster/prob_snapshot/cluster_34:0.022489081892609366 - cluster/prob_snapshot/cluster_35:0.02368023097358229 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.01691540879258256 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.023169772233395454 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.028686884529158196 - cluster/prob_snapshot/cluster_42:0.01935323137968769 - cluster/prob_snapshot/cluster_43:0.01902696923476709 - 
cluster/prob_snapshot/cluster_44:0.01833459805996203 - cluster/prob_snapshot/cluster_45:0.020974893175350524 - cluster/prob_snapshot/cluster_46:0.024872868451915504 - cluster/prob_snapshot/cluster_47:0.02678797065061835 - cluster/prob_snapshot/cluster_48:0.019664340080287206 - cluster/prob_snapshot/cluster_49:0.02783251740962078 - cluster/prob_snapshot/cluster_50:0.02007827539101239 - cluster/prob_snapshot/cluster_51:0.020771446389399824 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.026656781302227282 - cluster/prob_snapshot/cluster_54:0.024059142212362408 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.022221953834502556 - cluster/prob_snapshot/cluster_57:0.01508617686915379 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.02784999142901701 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826755) WARNING:2026-04-12 23:14:42,654:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  43%|████▎     | 342/800 [11:44:22<16:09:43, 127.04s/it]
(RewardLoopWorker pid=2826760) WARNING:2026-04-12 23:14:42,869:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m step:342 - global_seqlen/min:423811 - global_seqlen/max:548847 - global_seqlen/minmax_diff:125036 - global_seqlen/balanced_min:498316 - global_seqlen/balanced_max:498538 - global_seqlen/mean:498434.25 - frontier/skipped_zero_acc_count:24.0 - actor/entropy:np.float64(0.15243743530188042) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.010333682410418987 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.040719164768233895) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0014156669374362817) - actor/ppo_kl:np.float64(-0.00041282528750540535) - actor/pg_clipfrac_lower:np.float64(0.00015011715527180058) - actor/grad_norm:np.float64(0.4351793263967221) - perf/mfu/actor:np.float64(0.2130725380715697) - perf/max_memory_allocated_gb:np.float64(112.75414419174194) - perf/max_memory_reserved_gb:np.float64(119.345703125) - perf/cpu_memory_used_gb:np.float64(119.10433197021484) - actor/lr:np.float64(1e-06) - training/global_step:342 - training/epoch:0 - critic/score/mean:0.6634615659713745 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6966456174850464 - critic/rewards/max:1.7128558158874512 - critic/rewards/min:-0.25695523619651794 - critic/advantages/mean:-0.1302362084388733 - critic/advantages/max:2.464151620864868 - critic/advantages/min:-2.4740424156188965 - critic/returns/mean:-0.1302362084388733 - critic/returns/max:2.464151620864868 - critic/returns/min:-2.4740424156188965 - response_length/mean:1536.409912109375 - response_length/max:8192.0 - response_length/min:99.0 - response_length/clip_ratio:0.0625 - response_length_non_aborted/mean:1536.409912109375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:99.0 - response_length_non_aborted/clip_ratio:0.0625 - response/aborted_ratio:0.0 - prompt_length/mean:233.0576934814453 - prompt_length/max:452.0 - prompt_length/min:179.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.641090244054794e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.8620742745697498) - timing_s/agent_loop/generate_sequences/max:np.float64(38.528829340822995) - timing_s/agent_loop/generate_sequences/mean:np.float64(8.975517627006411) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(38.528829340822995) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:199 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:40.73798031453043 - timing_s/reward:0.00014295242726802826 - timing_s/old_log_prob:11.676780772395432 - timing_s/ref:16.618399472907186 - timing_s/adv:0.08909493405371904 - timing_s/update_actor:28.35983585100621 - timing_s/update_weights:24.03065158985555 - timing_s/step:121.97293341998011 - timing_s/stop_profile:6.0663558542728424e-05 - timing_per_token_ms/adv:6.0518350501813987e-05 - timing_per_token_ms/update_actor:0.019263614754687186 - timing_per_token_ms/gen:0.03186904748326904 - timing_per_token_ms/ref:0.01128816284295321 - perf/total_num_tokens:1993737 - perf/time_per_step:121.97293341998011 - perf/throughput:4086.43324403603 - frontier/active_count:43.0 - frontier/completed_count:21.0 - frontier/blacklisted_count:1477.0 - frontier/mean_score:2.8310621033636303 - frontier/mean_frontier_pct:0.5888188568807061 - frontier/batch_easy_count:5.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:144.0 - frontier/cluster_3/score:3.819477066375076 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.0492056050715095 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.3867359265699994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.8934177692715093 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:208.0 - frontier/cluster_10/score:4.962704792874529 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:48.0 - frontier/cluster_11/score:2.9176456999999996 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.9423519899999997 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.86787 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:80.0 - frontier/cluster_14/score:2.836541875099999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:96.0 - frontier/cluster_15/score:3.010370876998999 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.1307959265699994 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.9344167990715095 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:192.0 - frontier/cluster_18/score:4.398116754277165 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.370607518798999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.9717524750999993 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.662014568369999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:176.0 - frontier/cluster_23/score:2.295511540431057 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:64.0 - frontier/cluster_25/score:2.3434489999999997 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:128.0 - frontier/cluster_26/score:2.5938441883012993 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:2.4127936149144764 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.4713831000529627 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:3.5557222929592993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:144.0 - frontier/cluster_32/score:1.6114938088135096 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:96.0 - 
frontier/cluster_33/score:3.5442867325699994 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.8823509999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:2.981925043505453 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:2.1300671077070894 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:144.0 - frontier/cluster_41/score:3.612386192224093 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:112.0 - frontier/cluster_42/score:2.0059343623699992 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:96.0 - frontier/cluster_43/score:2.3959646392999994 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.7488794099999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:3.092475259060127 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.3732661069430394 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.4762253465699997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - frontier/cluster_49/score:3.953360211988513 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:80.0 - frontier/cluster_50/score:2.5283500099999996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:2.7309461327989992 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:112.0 - frontier/cluster_53/score:3.2497222929592993 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:3.0207475187989994 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:64.0 - frontier/cluster_56/score:2.798291989999999 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:1.8997216989592995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:3.9549004989592995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:342.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.03137516762691284 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01683323875070342 - 
cluster/prob_snapshot/cluster_6:0.011391368869165345 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.015553516585203452 - cluster/prob_snapshot/cluster_10:0.04076623371562657 - cluster/prob_snapshot/cluster_11:0.023967056568903602 - cluster/prob_snapshot/cluster_12:0.024170006862024435 - cluster/prob_snapshot/cluster_13:0.023558173126456573 - cluster/prob_snapshot/cluster_14:0.023300827643529705 - cluster/prob_snapshot/cluster_15:0.024728749314015395 - cluster/prob_snapshot/cluster_16:0.017503464011717578 - cluster/prob_snapshot/cluster_17:0.02410482308392959 - cluster/prob_snapshot/cluster_18:0.03612841444264584 - cluster/prob_snapshot/cluster_19:0.027687920118137522 - cluster/prob_snapshot/cluster_20:0.024411517710838222 - cluster/prob_snapshot/cluster_21:0.021867169734614892 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.018856523581352664 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.019250306762517662 - cluster/prob_snapshot/cluster_26:0.021307182839896938 - cluster/prob_snapshot/cluster_27:0.01981993943190041 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.012086704699442468 - cluster/prob_snapshot/cluster_31:0.029208548981347244 - cluster/prob_snapshot/cluster_32:0.013237646804158338 - cluster/prob_snapshot/cluster_33:0.02911461135117252 - cluster/prob_snapshot/cluster_34:0.023677127578730982 - cluster/prob_snapshot/cluster_35:0.024495080469134938 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.01749747711518801 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.023967056568903602 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.029674015668784297 - cluster/prob_snapshot/cluster_42:0.016477786297503277 - cluster/prob_snapshot/cluster_43:0.01968169748890203 - 
cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.022580722642425142 - cluster/prob_snapshot/cluster_46:0.025403197335382047 - cluster/prob_snapshot/cluster_47:0.027709759141443753 - cluster/prob_snapshot/cluster_48:0.020341000608331612 - cluster/prob_snapshot/cluster_49:0.03247495329469971 - cluster/prob_snapshot/cluster_50:0.020769179655932172 - cluster/prob_snapshot/cluster_51:0.022433409392861564 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.02669490611165799 - cluster/prob_snapshot/cluster_54:0.024813988437126146 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.022986623228666805 - cluster/prob_snapshot/cluster_57:0.015605300336545728 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.032487606011567007 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  43%|████▎     | 343/800 [11:47:00<17:17:57, 136.27s/it]
[36m(TaskRunner pid=2823680)[0m step:343 - global_seqlen/min:497761 - global_seqlen/max:732056 - global_seqlen/minmax_diff:234295 - global_seqlen/balanced_min:634467 - global_seqlen/balanced_max:634536 - global_seqlen/mean:634512.25 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.14206739336562654) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0006328369490802288 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.052159361279336736) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0011107406720990791) - actor/ppo_kl:np.float64(0.00011516259142870429) - actor/pg_clipfrac_lower:np.float64(1.1972151772473202e-05) - actor/grad_norm:np.float64(0.8440506507953008) - perf/mfu/actor:np.float64(0.302116973909) - perf/max_memory_allocated_gb:np.float64(115.3043327331543) - perf/max_memory_reserved_gb:np.float64(121.76953125) - perf/cpu_memory_used_gb:np.float64(119.16627883911133) - actor/lr:np.float64(1e-06) - training/global_step:343 - training/epoch:0 - critic/score/mean:0.6026315689086914 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6749994158744812 - critic/rewards/max:2.2076494693756104 - critic/rewards/min:-0.06962195038795471 - critic/advantages/mean:-0.010182286612689495 - critic/advantages/max:2.474498748779297 - critic/advantages/min:-2.47478985786438 - critic/returns/mean:-0.010182286612689495 - critic/returns/max:2.474498748779297 - critic/returns/min:-2.47478985786438 - response_length/mean:1960.0289306640625 - response_length/max:8192.0 - response_length/min:181.0 - response_length/clip_ratio:0.10657894611358643 - response_length_non_aborted/mean:1960.0289306640625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:181.0 - response_length_non_aborted/clip_ratio:0.10657894611358643 - response/aborted_ratio:0.0 - prompt_length/mean:240.50526428222656 - prompt_length/max:543.0 - prompt_length/min:182.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.013565093278885e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4398506851866841) - timing_s/agent_loop/generate_sequences/max:np.float64(45.15330175217241) - timing_s/agent_loop/generate_sequences/mean:np.float64(12.491856634769647) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(45.15330175217241) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:184 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:47.24625443853438 - timing_s/reward:0.0001368112862110138 - timing_s/old_log_prob:11.64666137099266 - timing_s/ref:33.887066127732396 - timing_s/adv:0.08371186256408691 - timing_s/update_actor:25.956313882023096 - timing_s/update_weights:37.452989174984396 - timing_s/step:156.6798632182181 - timing_s/stop_profile:5.263369530439377e-05 - timing_per_token_ms/adv:5.005474900477929e-05 - timing_per_token_ms/update_actor:0.01552034247785711 - timing_per_token_ms/gen:0.0317169419077688 - timing_per_token_ms/ref:0.020262463856104557 - perf/total_num_tokens:2538049 - perf/time_per_step:156.6798632182181 - perf/throughput:4049.7370687404427 - frontier/active_count:42.0 - frontier/completed_count:22.0 - frontier/blacklisted_count:1510.0 - frontier/mean_score:2.778162986384386 - frontier/mean_frontier_pct:0.5956991230401678 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.0492056050715095 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.3867359265699994 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:128.0 - frontier/cluster_9/score:1.8934177692715093 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:208.0 - frontier/cluster_10/score:4.962704792874529 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.9423519899999997 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.9596463929999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.9075089999999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:2.885579312569999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:3.0072596138992993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.1307959265699994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.9344167990715095 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:192.0 - frontier/cluster_18/score:4.398116754277165 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:96.0 - frontier/cluster_19/score:3.370607518798999 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.9802267325699994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.662014568369999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:176.0 - frontier/cluster_23/score:2.295511540431057 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:80.0 - frontier/cluster_25/score:1.9404142999999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:144.0 - frontier/cluster_26/score:2.7156909318109093 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:128.0 - frontier/cluster_27/score:2.4127936149144764 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.4713831000529627 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:112.0 - frontier/cluster_31/score:3.5557222929592993 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:144.0 - frontier/cluster_32/score:1.6114938088135096 - frontier/cluster_32/cluster_size:270.0 - 
frontier/cluster_33/frontier:112.0 - frontier/cluster_33/score:3.381000712798999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.8823509999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:160.0 - frontier/cluster_35/score:2.981925043505453 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:2.1300671077070894 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:48.0 - frontier/cluster_39/score:2.9176456999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:144.0 - frontier/cluster_41/score:3.4286703345568648 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:112.0 - frontier/cluster_42/score:2.3041540536589995 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:1.9771752475099995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.7488794099999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:3.092475259060127 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.3732661069430394 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.4762253465699997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - 
frontier/cluster_49/score:3.953360211988513 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:80.0 - frontier/cluster_50/score:2.5283500099999996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:96.0 - frontier/cluster_51/score:2.7309461327989992 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:112.0 - frontier/cluster_53/score:3.1748056050715094 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:96.0 - frontier/cluster_54/score:3.0145232631592993 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.2588043929999992 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:1.8997216989592995 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:3.9549004989592995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:343.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.017562184034442783 - 
cluster/prob_snapshot/cluster_6:0.011884659835656656 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.016227044878139488 - cluster/prob_snapshot/cluster_10:0.042531571583336716 - cluster/prob_snapshot/cluster_11:0.025216662991064634 - cluster/prob_snapshot/cluster_12:0.025364880177031788 - cluster/prob_snapshot/cluster_13:0.024918050201222643 - cluster/prob_snapshot/cluster_14:0.024730107514793168 - cluster/prob_snapshot/cluster_15:0.025772936911717958 - cluster/prob_snapshot/cluster_16:0.01826143267891242 - cluster/prob_snapshot/cluster_17:0.025148656499627324 - cluster/prob_snapshot/cluster_18:0.03769291653918072 - cluster/prob_snapshot/cluster_19:0.028886915693829996 - cluster/prob_snapshot/cluster_20:0.025541258628332703 - cluster/prob_snapshot/cluster_21:0.022814103980771744 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.019673085036691892 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.016629816429999115 - cluster/prob_snapshot/cluster_26:0.02327412330275481 - cluster/prob_snapshot/cluster_27:0.02067822057356603 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.012610106435560587 - cluster/prob_snapshot/cluster_31:0.030473393753060202 - cluster/prob_snapshot/cluster_32:0.013810888849174507 - cluster/prob_snapshot/cluster_33:0.028975987861738352 - cluster/prob_snapshot/cluster_34:0.024702440100974506 - cluster/prob_snapshot/cluster_35:0.025555813560801326 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.01825518652627331 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0250049241539687 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.029384527965323856 - cluster/prob_snapshot/cluster_42:0.019747153450058317 - cluster/prob_snapshot/cluster_43:0.016944866576189096 - 
cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.023558556529141364 - cluster/prob_snapshot/cluster_46:0.026503255450387 - cluster/prob_snapshot/cluster_47:0.028909700432531604 - cluster/prob_snapshot/cluster_48:0.02122184574333947 - cluster/prob_snapshot/cluster_49:0.03388124618903879 - cluster/prob_snapshot/cluster_50:0.02166856662367744 - cluster/prob_snapshot/cluster_51:0.02340486403788271 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.02720884628260649 - cluster/prob_snapshot/cluster_54:0.025835188129824164 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.01935849597009544 - cluster/prob_snapshot/cluster_57:0.016281071069090346 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.03389444681818984 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  43%|████▎     | 344/800 [11:49:39<18:08:39, 143.24s/it]
[36m(TaskRunner pid=2823680)[0m step:344 - global_seqlen/min:421746 - global_seqlen/max:650943 - global_seqlen/minmax_diff:229197 - global_seqlen/balanced_min:579588 - global_seqlen/balanced_max:579856 - global_seqlen/mean:579785.25 - frontier/skipped_zero_acc_count:41.0 - actor/entropy:np.float64(0.1819954999002882) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009959866292774677 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.002985824306961149) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0014197180634338029) - actor/ppo_kl:np.float64(6.77268882637608e-05) - actor/pg_clipfrac_lower:np.float64(1.8054654591734025e-05) - actor/grad_norm:np.float64(0.5768950608643618) - perf/mfu/actor:np.float64(0.31319753507206927) - perf/max_memory_allocated_gb:np.float64(115.3043327331543) - perf/max_memory_reserved_gb:np.float64(121.76953125) - perf/cpu_memory_used_gb:np.float64(112.114501953125) - actor/lr:np.float64(1e-06) - training/global_step:344 - training/epoch:0 - critic/score/mean:0.6293103694915771 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6603432297706604 - critic/rewards/max:1.4623613357543945 - critic/rewards/min:-0.12214462459087372 - critic/advantages/mean:-0.06564672291278839 - critic/advantages/max:2.473952531814575 - critic/advantages/min:-2.474820375442505 - critic/returns/mean:-0.06564672291278839 - critic/returns/max:2.473952531814575 - critic/returns/min:-2.474820375442505 - response_length/mean:1633.1451416015625 - response_length/max:8192.0 - response_length/min:114.0 - response_length/clip_ratio:0.07471264153718948 - response_length_non_aborted/mean:1633.1451416015625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:114.0 - response_length_non_aborted/clip_ratio:0.07471264153718948 - response/aborted_ratio:0.0 - prompt_length/mean:248.5287322998047 - prompt_length/max:728.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.40136781334877e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.9623687099665403) - timing_s/agent_loop/generate_sequences/max:np.float64(43.12395949661732) - timing_s/agent_loop/generate_sequences/mean:np.float64(10.909664441088353) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(43.12395949661732) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:180 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:44.90761131141335 - timing_s/reward:0.0002696467563509941 - timing_s/old_log_prob:10.377005929127336 - timing_s/ref:41.89818078838289 - timing_s/adv:0.06926284357905388 - timing_s/update_actor:22.700734067708254 - timing_s/update_weights:37.924023574218154 - timing_s/step:158.27954118791968 - timing_s/stop_profile:6.483867764472961e-05 - timing_per_token_ms/adv:5.288673157920954e-05 - timing_per_token_ms/update_actor:0.017333501878530635 - timing_per_token_ms/gen:0.03950808134242541 - timing_per_token_ms/ref:0.03199201370476953 - perf/total_num_tokens:2319141 - perf/time_per_step:158.27954118791968 - perf/throughput:3663.04606172469 - frontier/active_count:41.0 - frontier/completed_count:23.0 - frontier/blacklisted_count:1551.0 - frontier/mean_score:2.7671032457270877 - frontier/mean_frontier_pct:0.6088248251462987 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.0492056050715095 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.8707151485989995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:1.6253924384900564 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:208.0 - frontier/cluster_10/score:4.962704792874529 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.9423519899999997 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:64.0 - frontier/cluster_12/score:2.9596463929999994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:64.0 - frontier/cluster_13/score:2.9075089999999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:96.0 - frontier/cluster_14/score:2.885579312569999 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:3.0072596138992993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.1307959265699994 - 
frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.9540917593500566 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:192.0 - frontier/cluster_18/score:4.398116754277165 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:112.0 - frontier/cluster_19/score:3.859425263159299 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:80.0 - frontier/cluster_20/score:2.9802267325699994 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.763410197858999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:1.90685807830174 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:96.0 - frontier/cluster_25/score:1.6582900099999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:144.0 - frontier/cluster_26/score:2.7156909318109093 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:144.0 - frontier/cluster_27/score:2.588955530440133 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.4713831000529627 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:144.0 - frontier/cluster_32/score:1.6114938088135096 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:112.0 - 
frontier/cluster_33/score:3.381000712798999 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.8823509999999994 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:2.987347530453817 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:2.1300671077070894 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.9423519899999997 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:144.0 - frontier/cluster_41/score:3.4286703345568648 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:128.0 - frontier/cluster_42/score:2.5129078375612997 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:1.9771752475099995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.7488794099999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:3.092475259060127 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.3732661069430394 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.4762253465699997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - frontier/cluster_49/score:3.953360211988513 - 
frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:80.0 - frontier/cluster_50/score:2.5283500099999996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.211662292959299 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:3.1223639235500564 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:3.0101662842115093 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.2588043929999992 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.2298051892715094 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:3.9549004989592995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:344.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.018062435722670796 - 
cluster/prob_snapshot/cluster_6:0.016489156599694566 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.01432679393989716 - cluster/prob_snapshot/cluster_10:0.043743066147214855 - cluster/prob_snapshot/cluster_11:0.025934949407379213 - cluster/prob_snapshot/cluster_12:0.02608738849976524 - cluster/prob_snapshot/cluster_13:0.025627830753349033 - cluster/prob_snapshot/cluster_14:0.025434534595734422 - cluster/prob_snapshot/cluster_15:0.026507068565013215 - cluster/prob_snapshot/cluster_16:0.018781602181132194 - cluster/prob_snapshot/cluster_17:0.026038427959633605 - cluster/prob_snapshot/cluster_18:0.03876658397689705 - cluster/prob_snapshot/cluster_19:0.03401836329636334 - cluster/prob_snapshot/cluster_20:0.026268791019704613 - cluster/prob_snapshot/cluster_21:0.02435769204938289 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.016807733390119643 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.014616764975189921 - cluster/prob_snapshot/cluster_26:0.023937077264027287 - cluster/prob_snapshot/cluster_27:0.022819985823626585 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.012969300202164587 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.01420428641574022 - cluster/prob_snapshot/cluster_33:0.02980135712204681 - cluster/prob_snapshot/cluster_34:0.02540607908685625 - cluster/prob_snapshot/cluster_35:0.026331556295064135 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.01877517810936887 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.025934949407379213 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.030221534324761143 - cluster/prob_snapshot/cluster_42:0.022149674088639208 - cluster/prob_snapshot/cluster_43:0.017427534227029823 - cluster/prob_snapshot/cluster_44:0.0 - 
cluster/prob_snapshot/cluster_45:0.024229612455488853 - cluster/prob_snapshot/cluster_46:0.02725818993100697 - cluster/prob_snapshot/cluster_47:0.029733181522308205 - cluster/prob_snapshot/cluster_48:0.021826341410825904 - cluster/prob_snapshot/cluster_49:0.034846339743012204 - cluster/prob_snapshot/cluster_50:0.022285786954230693 - cluster/prob_snapshot/cluster_51:0.019494387438706042 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.027521639376904317 - cluster/prob_snapshot/cluster_54:0.026532689003270535 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.019909914875147518 - cluster/prob_snapshot/cluster_57:0.019654305456522973 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.03485991638672987 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  43%|████▎     | 345/800 [11:52:12<18:28:10, 146.13s/it]
[36m(TaskRunner pid=2823680)[0m step:345 - global_seqlen/min:458609 - global_seqlen/max:629622 - global_seqlen/minmax_diff:171013 - global_seqlen/balanced_min:566188 - global_seqlen/balanced_max:566397 - global_seqlen/mean:566304.0 - frontier/skipped_zero_acc_count:27.0 - actor/entropy:np.float64(0.15507243259572515) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.006811903789639473 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.09774397367436904) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0014490104237897777) - actor/ppo_kl:np.float64(6.379491001365856e-05) - actor/pg_clipfrac_lower:np.float64(3.511391930135456e-05) - actor/grad_norm:np.float64(0.6693356541486887) - perf/mfu/actor:np.float64(0.26672562140156675) - perf/max_memory_allocated_gb:np.float64(115.3043327331543) - perf/max_memory_reserved_gb:np.float64(121.76953125) - perf/cpu_memory_used_gb:np.float64(136.58229064941406) - actor/lr:np.float64(1e-06) - training/global_step:345 - training/epoch:0 - critic/score/mean:0.6188119053840637 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6672936081886292 - critic/rewards/max:1.660997986793518 - critic/rewards/min:-0.2228793203830719 - critic/advantages/mean:-0.03287890553474426 - critic/advantages/max:2.4723060131073 - critic/advantages/min:-2.4747724533081055 - critic/returns/mean:-0.03287890553474426 - critic/returns/max:2.4723060131073 - critic/returns/min:-2.4747724533081055 - response_length/mean:1735.230224609375 - response_length/max:8192.0 - response_length/min:252.0 - response_length/clip_ratio:0.08292078971862793 - response_length_non_aborted/mean:1735.230224609375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:252.0 - response_length_non_aborted/clip_ratio:0.08292078971862793 - response/aborted_ratio:0.0 - prompt_length/mean:237.4257354736328 - prompt_length/max:730.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00012008659541606903 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.5989639144390821) - timing_s/agent_loop/generate_sequences/max:np.float64(39.848691870458424) - timing_s/agent_loop/generate_sequences/mean:np.float64(10.592220842027928) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(39.848691870458424) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:189 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:41.656187457963824 - timing_s/reward:0.00013157911598682404 - timing_s/old_log_prob:14.454920111224055 - timing_s/ref:32.67506965994835 - timing_s/adv:0.09954108949750662 - timing_s/update_actor:26.033645372837782 - timing_s/update_weights:35.967609165236354 - timing_s/step:151.312840430066 - timing_s/stop_profile:5.057733505964279e-05 - timing_per_token_ms/adv:6.2451041339644e-05 - timing_per_token_ms/update_actor:0.016333237576643657 - timing_per_token_ms/gen:0.029710575292435468 - timing_per_token_ms/ref:0.02049999790448643 - perf/total_num_tokens:2265216 - perf/time_per_step:151.312840430066 - perf/throughput:3742.603723454225 - frontier/active_count:40.0 - frontier/completed_count:24.0 - frontier/blacklisted_count:1578.0 - frontier/mean_score:2.771414572708248 - frontier/mean_frontier_pct:0.6161217236856098 - frontier/batch_easy_count:3.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.0492056050715095 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.8707151485989995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:1.6253924384900564 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:208.0 - frontier/cluster_10/score:4.962704792874529 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:64.0 - frontier/cluster_11/score:2.9423519899999997 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:2.9717524750999993 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:80.0 - frontier/cluster_13/score:3.5352562999999995 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.5199055187989994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:3.0072596138992993 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.1307959265699994 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.9540917593500566 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:192.0 - frontier/cluster_18/score:4.398116754277165 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:96.0 - frontier/cluster_20/score:2.9861587127989995 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.763410197858999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:1.90685807830174 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:96.0 - frontier/cluster_25/score:1.6582900099999998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:144.0 - frontier/cluster_26/score:2.7156909318109093 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:144.0 - frontier/cluster_27/score:2.712268871308093 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.4713831000529627 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:144.0 - frontier/cluster_32/score:2.0280456661694566 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:112.0 - frontier/cluster_33/score:3.2667004989592994 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.9176456999999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:2.987347530453817 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:144.0 - frontier/cluster_37/score:2.3910469753949624 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.9596463929999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:144.0 - frontier/cluster_41/score:3.4286703345568648 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:144.0 - frontier/cluster_42/score:2.0590354862929097 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:1.9771752475099995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.7488794099999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:3.0647326813420888 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.3732661069430394 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.4762253465699997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:160.0 - frontier/cluster_49/score:3.953360211988513 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:96.0 - frontier/cluster_50/score:2.0698450069999996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.211662292959299 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:3.0856547464850395 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:3.0101662842115093 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:80.0 - frontier/cluster_56/score:2.4811630750999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:128.0 - frontier/cluster_57/score:2.2298051892715094 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:3.9549004989592995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:345.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.018485195477891007 - cluster/prob_snapshot/cluster_6:0.016875093021277238 - 
cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.014662119252170475 - cluster/prob_snapshot/cluster_10:0.04476689306739965 - cluster/prob_snapshot/cluster_11:0.026541969027072614 - cluster/prob_snapshot/cluster_12:0.026807180928149446 - cluster/prob_snapshot/cluster_13:0.03189035966338049 - cluster/prob_snapshot/cluster_14:0.031751885422173776 - cluster/prob_snapshot/cluster_15:0.027127478901149946 - cluster/prob_snapshot/cluster_16:0.019221194363640164 - cluster/prob_snapshot/cluster_17:0.026647869543235597 - cluster/prob_snapshot/cluster_18:0.03967393400456947 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.026937134759677097 - cluster/prob_snapshot/cluster_21:0.024927795222987634 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.017201126250468757 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.01495887719515296 - cluster/prob_snapshot/cluster_26:0.024497335751874853 - cluster/prob_snapshot/cluster_27:0.02446646649347776 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.013272852738656813 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.01829431877623811 - cluster/prob_snapshot/cluster_33:0.029467807984489437 - cluster/prob_snapshot/cluster_34:0.026319101883310565 - cluster/prob_snapshot/cluster_35:0.02694785868444213 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.02156883166218626 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.026697976027706044 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.030928883469122606 - cluster/prob_snapshot/cluster_42:0.018573867534737002 - cluster/prob_snapshot/cluster_43:0.01783543381582468 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.02479671786630043 - 
cluster/prob_snapshot/cluster_46:0.027645924138545684 - cluster/prob_snapshot/cluster_47:0.03042910054094377 - cluster/prob_snapshot/cluster_48:0.022337197138916437 - cluster/prob_snapshot/cluster_49:0.035661934621037755 - cluster/prob_snapshot/cluster_50:0.018671376590342913 - cluster/prob_snapshot/cluster_51:0.01995066269351075 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.027834655060914558 - cluster/prob_snapshot/cluster_54:0.027153698997746407 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.02238173872950545 - cluster/prob_snapshot/cluster_57:0.02011432366732241 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.03567582903245093 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  43%|████▎     | 346/800 [11:54:41<18:32:38, 147.04s/it]
[36m(TaskRunner pid=2823680)[0m step:346 - global_seqlen/min:379306 - global_seqlen/max:585823 - global_seqlen/minmax_diff:206517 - global_seqlen/balanced_min:527435 - global_seqlen/balanced_max:527517 - global_seqlen/mean:527475.25 - frontier/skipped_zero_acc_count:27.0 - actor/entropy:np.float64(0.16533745513023698) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.00976878497749567 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04865377680835081) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00192292697765344) - actor/ppo_kl:np.float64(0.0011270812095077634) - actor/pg_clipfrac_lower:np.float64(4.516064464220179e-05) - actor/grad_norm:np.float64(0.5247033204023654) - perf/mfu/actor:np.float64(0.22911119014572834) - perf/max_memory_allocated_gb:np.float64(115.3043327331543) - perf/max_memory_reserved_gb:np.float64(121.76953125) - perf/cpu_memory_used_gb:np.float64(273.4072074890137) - actor/lr:np.float64(1e-06) - training/global_step:346 - training/epoch:0 - critic/score/mean:0.6349009871482849 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6743668913841248 - critic/rewards/max:2.2282280921936035 - critic/rewards/min:-0.30313077569007874 - critic/advantages/mean:-0.07773218303918839 - critic/advantages/max:2.474668502807617 - critic/advantages/min:-2.474792957305908 - critic/returns/mean:-0.07773218303918839 - critic/returns/max:2.474668502807617 - critic/returns/min:-2.474792957305908 - response_length/mean:1667.6868896484375 - response_length/max:8192.0 - response_length/min:126.0 - response_length/clip_ratio:0.08787129074335098 - response_length_non_aborted/mean:1667.6868896484375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:126.0 - response_length_non_aborted/clip_ratio:0.08787129074335098 - response/aborted_ratio:0.0 - prompt_length/mean:237.64356994628906 - prompt_length/max:480.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.906369864940643e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0983805153518915) - timing_s/agent_loop/generate_sequences/max:np.float64(40.50510066188872) - timing_s/agent_loop/generate_sequences/mean:np.float64(9.548408245490464) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(40.50510066188872) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:181 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:42.77441290579736 - timing_s/reward:0.00014875270426273346 - timing_s/old_log_prob:11.789438053965569 - timing_s/ref:29.43322192877531 - timing_s/adv:0.1287503344938159 - timing_s/update_actor:28.815972461365163 - timing_s/update_weights:34.2312421137467 - timing_s/step:147.73744551371783 - timing_s/stop_profile:5.9382058680057526e-05 - timing_per_token_ms/adv:8.363088605236345e-05 - timing_per_token_ms/update_actor:0.018717662512327104 - timing_per_token_ms/gen:0.03174374664157116 - timing_per_token_ms/ref:0.019118602207573793 - perf/total_num_tokens:2109901 - perf/time_per_step:147.73744551371783 - perf/throughput:3570.355830682225 - frontier/active_count:38.0 - frontier/completed_count:26.0 - frontier/blacklisted_count:1605.0 - frontier/mean_score:2.6823661334922915 - frontier/mean_frontier_pct:0.6136884147456252 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:128.0 - frontier/cluster_5/score:2.0492056050715095 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.8707151485989995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:1.6253924384900564 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.3596463929999993 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:80.0 - frontier/cluster_12/score:2.9802267325699994 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.3639338631592994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:3.0050817297295094 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.1307959265699994 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.9540917593500566 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:192.0 - frontier/cluster_18/score:3.978681727994015 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:96.0 - frontier/cluster_20/score:2.9903110989592996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.763410197858999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:1.90685807830174 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:96.0 - frontier/cluster_25/score:2.0608030069999996 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:144.0 - frontier/cluster_26/score:2.7156909318109093 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:160.0 - frontier/cluster_27/score:2.198588209915665 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.4713831000529627 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:144.0 - frontier/cluster_32/score:2.0280456661694566 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:112.0 - frontier/cluster_33/score:3.2667004989592994 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:48.0 - frontier/cluster_34/score:2.9176456999999996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:2.987347530453817 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:160.0 - frontier/cluster_37/score:2.5737328827764734 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.9596463929999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:160.0 - frontier/cluster_41/score:3.300069234189805 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:144.0 - frontier/cluster_42/score:2.0590354862929097 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:1.9771752475099995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.8242155869999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:144.0 - frontier/cluster_46/score:3.0647326813420888 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.3732661069430394 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.4762253465699997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:176.0 - frontier/cluster_49/score:4.267352148391959 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:96.0 - frontier/cluster_50/score:2.0698450069999996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.211662292959299 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:3.0856547464850395 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:3.0101662842115093 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.6368141525699995 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:144.0 - frontier/cluster_57/score:2.4608636324900566 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:112.0 - frontier/cluster_59/score:3.9549004989592995 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:346.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.02010406507076879 - cluster/prob_snapshot/cluster_6:0.01835295540048788 - 
cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.015946177029805338 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.02314969494104103 - cluster/prob_snapshot/cluster_12:0.02923799935396972 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.03300242060208361 - cluster/prob_snapshot/cluster_15:0.029481843348438556 - cluster/prob_snapshot/cluster_16:0.020904520197619444 - cluster/prob_snapshot/cluster_17:0.028981597946060145 - cluster/prob_snapshot/cluster_18:0.03903350457246052 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.029336933671534643 - cluster/prob_snapshot/cluster_21:0.027110885456047094 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.018707541494139893 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.020217843269718275 - cluster/prob_snapshot/cluster_26:0.026642727830777157 - cluster/prob_snapshot/cluster_27:0.021569607425716154 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.014435243352002256 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.019896472046658592 - cluster/prob_snapshot/cluster_33:0.03204849685910305 - cluster/prob_snapshot/cluster_34:0.028624037949672627 - cluster/prob_snapshot/cluster_35:0.02930785910042843 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.02525002528885331 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.029036092583429065 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.03237586626642024 - cluster/prob_snapshot/cluster_42:0.020200502720179796 - cluster/prob_snapshot/cluster_43:0.019397399525884724 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.027707426621520547 - cluster/prob_snapshot/cluster_46:0.03006705871666937 - 
cluster/prob_snapshot/cluster_47:0.03309397609842827 - cluster/prob_snapshot/cluster_48:0.024293411736785224 - cluster/prob_snapshot/cluster_49:0.041865552709223215 - cluster/prob_snapshot/cluster_50:0.020306551282189063 - cluster/prob_snapshot/cluster_51:0.02169787284505688 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.03027231804155496 - cluster/prob_snapshot/cluster_54:0.02953172619763095 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.025868894351839068 - cluster/prob_snapshot/cluster_57:0.02414270314087986 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.03880019495492212 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  43%|████▎     | 347/800 [11:57:05<18:22:42, 146.05s/it]
[36m(TaskRunner pid=2823680)[0m step:347 - global_seqlen/min:375158 - global_seqlen/max:701989 - global_seqlen/minmax_diff:326831 - global_seqlen/balanced_min:543094 - global_seqlen/balanced_max:543467 - global_seqlen/mean:543209.75 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.16126318774574125) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.011275983415544033 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.021321701497072354) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0013378470576602315) - actor/ppo_kl:np.float64(0.0005172064250113332) - actor/pg_clipfrac_lower:np.float64(4.3102942678766944e-05) - actor/grad_norm:np.float64(0.9829108094175657) - perf/mfu/actor:np.float64(0.25523098424256374) - perf/max_memory_allocated_gb:np.float64(115.3043327331543) - perf/max_memory_reserved_gb:np.float64(121.76953125) - perf/cpu_memory_used_gb:np.float64(313.4658489227295) - actor/lr:np.float64(1e-06) - training/global_step:347 - training/epoch:0 - critic/score/mean:0.6666666865348816 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6964804530143738 - critic/rewards/max:1.433404564857483 - critic/rewards/min:-0.19992244243621826 - critic/advantages/mean:-0.14095456898212433 - critic/advantages/max:2.4705448150634766 - critic/advantages/min:-2.4748122692108154 - critic/returns/mean:-0.14095456898212433 - critic/returns/max:2.4705448150634766 - critic/returns/min:-2.4748122692108154 - response_length/mean:1575.8685302734375 - response_length/max:8192.0 - response_length/min:246.0 - response_length/clip_ratio:0.0716145858168602 - response_length_non_aborted/mean:1575.8685302734375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:246.0 - response_length_non_aborted/clip_ratio:0.0716145858168602 - response/aborted_ratio:0.0 - prompt_length/mean:241.3020782470703 - prompt_length/max:886.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010558124631643295 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.97918950766325) - timing_s/agent_loop/generate_sequences/max:np.float64(40.576203249394894) - timing_s/agent_loop/generate_sequences/mean:np.float64(10.372305953801515) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(40.576203249394894) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:195 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:42.76740459911525 - timing_s/reward:0.0001658918336033821 - timing_s/old_log_prob:12.300691730342805 - timing_s/ref:27.976592797785997 - timing_s/adv:0.13558911345899105 - timing_s/update_actor:26.476523641496897 - timing_s/update_weights:31.861689737066627 - timing_s/step:142.1547406092286 - timing_s/stop_profile:8.884165436029434e-05 - timing_per_token_ms/adv:9.715561513470034e-05 - timing_per_token_ms/update_actor:0.018971603806496404 - timing_per_token_ms/gen:0.03533716493890625 - timing_per_token_ms/ref:0.02004646990677471 - perf/total_num_tokens:2172839 - perf/time_per_step:142.1547406092286 - perf/throughput:3821.2566648989764 - frontier/active_count:38.0 - frontier/completed_count:26.0 - frontier/blacklisted_count:1637.0 - frontier/mean_score:2.7239020493457353 - frontier/mean_frontier_pct:0.6348526612351695 - frontier/batch_easy_count:4.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:144.0 - frontier/cluster_5/score:2.3344439235500563 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.8707151485989995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:1.6253924384900564 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.3596463929999993 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.9861587127989995 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.3639338631592994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:112.0 - frontier/cluster_15/score:3.0050817297295094 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.1307959265699994 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.9540917593500566 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:208.0 - frontier/cluster_18/score:4.28507720959581 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:96.0 - frontier/cluster_20/score:2.9903110989592996 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:96.0 - frontier/cluster_21/score:2.763410197858999 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:1.90685807830174 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:112.0 - frontier/cluster_25/score:1.7425621048999997 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:144.0 - frontier/cluster_26/score:2.7156909318109093 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:160.0 - frontier/cluster_27/score:2.198588209915665 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.4713831000529627 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:160.0 - frontier/cluster_32/score:2.319631966318619 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:128.0 - frontier/cluster_33/score:3.7866903492715096 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:3.54235199 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:2.9911432713176715 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:160.0 - frontier/cluster_37/score:2.701613017943531 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:64.0 - frontier/cluster_39/score:2.9596463929999994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:2.610048463932863 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:144.0 - frontier/cluster_42/score:2.0590354862929097 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:1.9771752475099995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:80.0 - frontier/cluster_45/score:2.8242155869999994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:160.0 - frontier/cluster_46/score:3.045312876939462 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:128.0 - frontier/cluster_47/score:3.3732661069430394 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.4762253465699997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:176.0 - frontier/cluster_49/score:3.887146503874371 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:96.0 
- frontier/cluster_50/score:2.0698450069999996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.4481636050715094 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:3.0856547464850395 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:3.0101662842115093 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:96.0 - frontier/cluster_56/score:2.7457699067989996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:144.0 - frontier/cluster_57/score:2.6226045427430393 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:128.0 - frontier/cluster_59/score:4.26843034927151 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:347.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.022553209960328936 - cluster/prob_snapshot/cluster_6:0.018073097021821225 - cluster/prob_snapshot/cluster_7:0.0 - 
cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.015703018848896005 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.02279669260700522 - cluster/prob_snapshot/cluster_12:0.028849467637759398 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.03249917718868063 - cluster/prob_snapshot/cluster_15:0.029032284097650043 - cluster/prob_snapshot/cluster_16:0.020585753819036375 - cluster/prob_snapshot/cluster_17:0.02853966677828007 - cluster/prob_snapshot/cluster_18:0.041398401147828 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.028889584102312207 - cluster/prob_snapshot/cluster_21:0.026697480187937248 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.01842227614492807 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.016835002385046465 - cluster/prob_snapshot/cluster_26:0.026236461349370025 - cluster/prob_snapshot/cluster_27:0.021240699343561896 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.014215124918104543 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.022410110707443046 - cluster/prob_snapshot/cluster_33:0.036583454260918094 - cluster/prob_snapshot/cluster_34:0.03422288596350854 - cluster/prob_snapshot/cluster_35:0.028897623771944967 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.02610045373571462 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.028593330444258973 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.025215842805165447 - cluster/prob_snapshot/cluster_42:0.019892471680156096 - cluster/prob_snapshot/cluster_43:0.01910161475099669 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.02728492488694335 - cluster/prob_snapshot/cluster_46:0.029420959747905606 - 
cluster/prob_snapshot/cluster_47:0.03258933658438614 - cluster/prob_snapshot/cluster_48:0.023922969229157407 - cluster/prob_snapshot/cluster_49:0.03755396750551136 - cluster/prob_snapshot/cluster_50:0.019996903141378256 - cluster/prob_snapshot/cluster_51:0.023651862974908438 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.029810705093628022 - cluster/prob_snapshot/cluster_54:0.029081406299143277 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.02652705622357298 - cluster/prob_snapshot/cluster_57:0.025337147874363128 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.041237574780449135 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  44%|████▎     | 348/800 [11:59:52<19:07:12, 152.28s/it]
[36m(TaskRunner pid=2823680)[0m step:348 - global_seqlen/min:611045 - global_seqlen/max:705249 - global_seqlen/minmax_diff:94204 - global_seqlen/balanced_min:662851 - global_seqlen/balanced_max:663035 - global_seqlen/mean:662950.25 - frontier/skipped_zero_acc_count:30.0 - actor/entropy:np.float64(0.15113595048231737) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0015834822552278638 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.01940475939773023) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0019615641168475493) - actor/ppo_kl:np.float64(0.005631794130392466) - actor/pg_clipfrac_lower:np.float64(9.263392354225697e-05) - actor/grad_norm:np.float64(0.8614737013211617) - perf/mfu/actor:np.float64(0.2816489292326585) - perf/max_memory_allocated_gb:np.float64(115.3043327331543) - perf/max_memory_reserved_gb:np.float64(121.76953125) - perf/cpu_memory_used_gb:np.float64(668.5617523193359) - actor/lr:np.float64(1e-06) - training/global_step:348 - training/epoch:0 - critic/score/mean:0.5663265585899353 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6386559009552002 - critic/rewards/max:1.925902009010315 - critic/rewards/min:-0.08933950960636139 - critic/advantages/mean:0.00701728044077754 - critic/advantages/max:2.463252305984497 - critic/advantages/min:-2.4744369983673096 - critic/returns/mean:0.00701728044077754 - critic/returns/max:2.463252305984497 - critic/returns/min:-2.4744369983673096 - response_length/mean:1980.2462158203125 - response_length/max:8192.0 - response_length/min:137.0 - response_length/clip_ratio:0.11224489659070969 - response_length_non_aborted/mean:1980.2462158203125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:137.0 - response_length_non_aborted/clip_ratio:0.11224489659070969 - response/aborted_ratio:0.0 - prompt_length/mean:236.0408172607422 - prompt_length/max:804.0 - prompt_length/min:172.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010250788182020187 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.173767545260489) - timing_s/agent_loop/generate_sequences/max:np.float64(46.907200493849814) - timing_s/agent_loop/generate_sequences/mean:np.float64(13.419656847542683) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(46.907200493849814) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:193 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:48.83021157421172 - timing_s/reward:0.000211433507502079 - timing_s/old_log_prob:11.658411772921681 - timing_s/ref:35.431227904744446 - timing_s/adv:0.1005520885810256 - timing_s/update_actor:29.733116442337632 - timing_s/update_weights:38.76146337017417 - timing_s/step:165.10592437162995 - timing_s/stop_profile:5.3318217396736145e-05 - timing_per_token_ms/adv:5.7869407534909754e-05 - timing_per_token_ms/update_actor:0.017111905450855554 - timing_per_token_ms/gen:0.03145236888464813 - timing_per_token_ms/ref:0.020391263831677733 - perf/total_num_tokens:2651801 - perf/time_per_step:165.10592437162995 - perf/throughput:4015.302615718339 - frontier/active_count:38.0 - frontier/completed_count:26.0 - frontier/blacklisted_count:1667.0 - frontier/mean_score:2.701571038594563 - frontier/mean_frontier_pct:0.6608657506177983 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:160.0 - frontier/cluster_5/score:1.9341107464850393 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.8707151485989995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:1.6253924384900564 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.3596463929999993 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.9903110989592996 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.3639338631592994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:128.0 - frontier/cluster_15/score:3.0035572108106563 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.1307959265699994 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.9540917593500566 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:208.0 - frontier/cluster_18/score:4.28507720959581 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.9932177692715096 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:2.234387138501299 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:1.90685807830174 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:128.0 - frontier/cluster_25/score:1.5197934734299998 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:144.0 - frontier/cluster_26/score:2.8009836522676363 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:176.0 - frontier/cluster_27/score:1.8390117469409653 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.4713831000529627 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:160.0 - frontier/cluster_32/score:2.523742376423033 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:128.0 - frontier/cluster_33/score:3.7866903492715096 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:64.0 - frontier/cluster_34/score:3.379646393 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:2.9911432713176715 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:160.0 - frontier/cluster_37/score:2.701613017943531 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.9717524750999993 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:2.610048463932863 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:160.0 - frontier/cluster_42/score:1.7413248404050368 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:1.9771752475099995 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.8769509108999993 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:160.0 - frontier/cluster_46/score:3.045312876939462 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:144.0 - frontier/cluster_47/score:3.8612862748601273 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:96.0 - frontier/cluster_48/score:2.4762253465699997 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:176.0 - frontier/cluster_49/score:3.887146503874371 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:96.0 
- frontier/cluster_50/score:2.0698450069999996 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.4481636050715094 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:128.0 - frontier/cluster_53/score:3.0856547464850395 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:3.0101662842115093 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.8220389347592993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:144.0 - frontier/cluster_57/score:2.6226045427430393 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:144.0 - frontier/cluster_59/score:4.487901244490057 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:348.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.01884001956497467 - cluster/prob_snapshot/cluster_6:0.01822248806804418 - cluster/prob_snapshot/cluster_7:0.0 - 
cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.015832818982866564 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.02298512858756083 - cluster/prob_snapshot/cluster_12:0.029128383528264468 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.03276781327665937 - cluster/prob_snapshot/cluster_15:0.02925741285447701 - cluster/prob_snapshot/cluster_16:0.020755914323160318 - cluster/prob_snapshot/cluster_17:0.028775573810357304 - cluster/prob_snapshot/cluster_18:0.04174059764309189 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.02915669717351487 - cluster/prob_snapshot/cluster_21:0.021764988112286394 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.018574553483105147 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.01480418782956361 - cluster/prob_snapshot/cluster_26:0.027284159868197426 - cluster/prob_snapshot/cluster_27:0.017913667743975816 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.01433262621747463 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.02458357456271618 - cluster/prob_snapshot/cluster_33:0.03688585071792209 - cluster/prob_snapshot/cluster_34:0.03292086778512121 - cluster/prob_snapshot/cluster_35:0.029136489653284275 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.026316198391196987 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.028947605443429537 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.025424275324147056 - cluster/prob_snapshot/cluster_42:0.016962107325978325 - cluster/prob_snapshot/cluster_43:0.019259507457971805 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.028024150916555052 - cluster/prob_snapshot/cluster_46:0.02966415149043409 - 
cluster/prob_snapshot/cluster_47:0.037612483719734954 - cluster/prob_snapshot/cluster_48:0.02412071493912556 - cluster/prob_snapshot/cluster_49:0.037864386161965066 - cluster/prob_snapshot/cluster_50:0.020162196244043654 - cluster/prob_snapshot/cluster_51:0.023847367738185996 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.030057118445872187 - cluster/prob_snapshot/cluster_54:0.029321791314916627 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.027489257706989706 - cluster/prob_snapshot/cluster_57:0.02554658309316855 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.043716290499666514 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  44%|████▎     | 349/800 [12:02:52<20:07:41, 160.67s/it]
[36m(TaskRunner pid=2823680)[0m step:349 - global_seqlen/min:501555 - global_seqlen/max:832372 - global_seqlen/minmax_diff:330817 - global_seqlen/balanced_min:606552 - global_seqlen/balanced_max:606689 - global_seqlen/mean:606596.25 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.1468132424245899) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:-0.006180707365274429 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.01858712588000344) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0015575080556876249) - actor/ppo_kl:np.float64(-0.00043724644557959397) - actor/pg_clipfrac_lower:np.float64(1.7488404031003785e-05) - actor/grad_norm:np.float64(0.9053675532341003) - perf/mfu/actor:np.float64(0.2613435182806631) - perf/max_memory_allocated_gb:np.float64(115.3043327331543) - perf/max_memory_reserved_gb:np.float64(121.76953125) - perf/cpu_memory_used_gb:np.float64(678.641435623169) - actor/lr:np.float64(1e-06) - training/global_step:349 - training/epoch:0 - critic/score/mean:0.6157894730567932 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.7245258688926697 - critic/rewards/max:2.441981077194214 - critic/rewards/min:-0.18823668360710144 - critic/advantages/mean:0.12821367383003235 - critic/advantages/max:2.4722979068756104 - critic/advantages/min:-2.4739797115325928 - critic/returns/mean:0.12821367383003235 - critic/returns/max:2.4722979068756104 - critic/returns/min:-2.4739797115325928 - response_length/mean:1933.20654296875 - response_length/max:8192.0 - response_length/min:162.0 - response_length/clip_ratio:0.09868421405553818 - response_length_non_aborted/mean:1933.20654296875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:162.0 - response_length_non_aborted/clip_ratio:0.09868421405553818 - response/aborted_ratio:0.0 - prompt_length/mean:229.55789184570312 - prompt_length/max:495.0 - prompt_length/min:175.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.807051926851273e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1809265119954944) - timing_s/agent_loop/generate_sequences/max:np.float64(45.36171036027372) - timing_s/agent_loop/generate_sequences/mean:np.float64(11.841182870893135) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(45.36171036027372) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:180 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:47.23337220773101 - timing_s/reward:0.00015939678996801376 - timing_s/old_log_prob:12.603691706433892 - timing_s/ref:40.89135464001447 - timing_s/adv:0.1253773495554924 - timing_s/update_actor:28.878117479383945 - timing_s/update_weights:48.317446171306074 - timing_s/step:178.63109925109893 - timing_s/stop_profile:5.342159420251846e-05 - timing_per_token_ms/adv:7.627746746853132e-05 - timing_per_token_ms/update_actor:0.01756896021805909 - timing_per_token_ms/gen:0.032148232182916035 - timing_per_token_ms/ref:0.02487761134173093 - perf/total_num_tokens:2426385 - perf/time_per_step:178.63109925109893 - perf/throughput:3395.804272285853 - frontier/active_count:37.0 - frontier/completed_count:27.0 - frontier/blacklisted_count:1700.0 - frontier/mean_score:2.670230247953923 - frontier/mean_frontier_pct:0.6705874907808207 - frontier/batch_easy_count:4.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:160.0 - frontier/cluster_5/score:1.9341107464850393 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.8707151485989995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:144.0 - frontier/cluster_9/score:2.0377747069430394 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.5517524750999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.9903110989592996 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:112.0 - frontier/cluster_14/score:3.3639338631592994 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:128.0 - frontier/cluster_15/score:3.0035572108106563 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:112.0 - frontier/cluster_16/score:2.1307959265699994 - frontier/cluster_16/cluster_size:150.0 - 
frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.9540917593500566 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.9932177692715096 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:2.234387138501299 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:2.2348006548112176 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:144.0 - frontier/cluster_25/score:1.3638554314009999 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:160.0 - frontier/cluster_26/score:2.260688556587345 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:176.0 - frontier/cluster_27/score:1.8390117469409653 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.4713831000529627 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:176.0 - frontier/cluster_32/score:2.066619663496123 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:128.0 - frontier/cluster_33/score:3.7866903492715096 - 
frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.8657524751 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:2.9911432713176715 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:160.0 - frontier/cluster_37/score:2.701613017943531 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.9717524750999993 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:176.0 - frontier/cluster_41/score:2.727033924753004 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:160.0 - frontier/cluster_42/score:1.7413248404050368 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:2.2840226732569997 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:96.0 - frontier/cluster_45/score:2.9138656376299994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:160.0 - frontier/cluster_46/score:3.045312876939462 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:144.0 - frontier/cluster_47/score:3.8612862748601273 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:112.0 - frontier/cluster_48/score:2.6333577425989994 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:192.0 - frontier/cluster_49/score:4.22100255271206 - frontier/cluster_49/cluster_size:219.0 - 
frontier/cluster_50/frontier:112.0 - frontier/cluster_50/score:1.7488915048999998 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.4481636050715094 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:144.0 - frontier/cluster_53/score:2.4599583225395274 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:3.0101662842115093 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.8220389347592993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:144.0 - frontier/cluster_57/score:2.6226045427430393 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:4.64153087114304 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:349.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.019576313113286475 - cluster/prob_snapshot/cluster_6:0.01893464764688206 - 
cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.020625559208515106 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.02582784131205785 - cluster/prob_snapshot/cluster_12:0.030266760311295257 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.03404842391640329 - cluster/prob_snapshot/cluster_15:0.03040083227879093 - cluster/prob_snapshot/cluster_16:0.02156708364030195 - cluster/prob_snapshot/cluster_17:0.029900162310516048 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.030296180492250423 - cluster/prob_snapshot/cluster_21:0.022615593403373904 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.022619778853858123 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.013804411673366865 - cluster/prob_snapshot/cluster_26:0.022881806078481375 - cluster/prob_snapshot/cluster_27:0.01861375820518805 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.014892764712973341 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.020917516585955115 - cluster/prob_snapshot/cluster_33:0.0383273998529394 - cluster/prob_snapshot/cluster_34:0.039127635792599716 - cluster/prob_snapshot/cluster_35:0.030275183234688512 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0273446711602781 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.03007891717341826 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.027601971644353438 - cluster/prob_snapshot/cluster_42:0.017625009513888394 - cluster/prob_snapshot/cluster_43:0.023117984888292174 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.02949300922708681 - cluster/prob_snapshot/cluster_46:0.030823466812969956 - 
cluster/prob_snapshot/cluster_47:0.03908243065919085 - cluster/prob_snapshot/cluster_48:0.02665381793783132 - cluster/prob_snapshot/cluster_49:0.04272333824422601 - cluster/prob_snapshot/cluster_50:0.01770159633480637 - cluster/prob_snapshot/cluster_51:0.024779355252810907 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.02489873677357247 - cluster/prob_snapshot/cluster_54:0.030467726736886944 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.02856357522708217 - cluster/prob_snapshot/cluster_57:0.026544978251307378 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.046979761538274635 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) local_global_step_folder: /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/checkpoints/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier/efficiency_qwen2_5_math_1_5b_16k_dapo_math_grpo_quarter_frontier-20260412.112633/global_step_350
(WorkerDict pid=2825160) /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead.
(WorkerDict pid=2825160)   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}")
(TaskRunner pid=2823680) test_gen_batch meta info: {'eos_token_id': 151643, 'pad_token_id': 151643, 'recompute_log_prob': False, 'do_sample': True, 'validate': True, 'global_steps': 350}
(RewardLoopWorker pid=2826759) WARNING:2026-04-12 23:39:37,631:WARNING: Error in configuration: macro '\frac' failed its substitution!
(WorkerDict pid=2825159) /storage/workspace/server-5/rl/jeremy/efficiency/verl/verl/utils/device.py:146: FutureWarning: torch.cuda._set_allocator_settings is deprecated. Use torch._C._accelerator_setAllocatorSettings instead. [repeated 3x across cluster]
(WorkerDict pid=2825159)   torch.cuda.memory._set_allocator_settings(f"expandable_segments:{enable}") [repeated 3x across cluster]
(RewardLoopWorker pid=2826759) WARNING:2026-04-12 23:39:42,853:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) validation generation end
(TaskRunner pid=2823680) Training Progress:  44%|████▍     | 350/800 [12:09:54<29:53:27, 239.13s/it]
[36m(TaskRunner pid=2823680)[0m step:350 - global_seqlen/min:534324 - global_seqlen/max:781797 - global_seqlen/minmax_diff:247473 - global_seqlen/balanced_min:657241 - global_seqlen/balanced_max:657326 - global_seqlen/mean:657296.25 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.16139033030560043) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:-0.004497294779866934 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.05115309031680226) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0012922822310467885) - actor/ppo_kl:np.float64(-2.7342625445910134e-05) - actor/pg_clipfrac_lower:np.float64(2.4249746391123676e-05) - actor/grad_norm:np.float64(0.508454995850722) - perf/mfu/actor:np.float64(0.27578260107436325) - perf/max_memory_allocated_gb:np.float64(122.3991756439209) - perf/max_memory_reserved_gb:np.float64(128.71875) - perf/cpu_memory_used_gb:np.float64(679.2377109527588) - actor/lr:np.float64(1e-06) - val-aux/aime2024/reward/mean@16:np.float64(0.0875) - val-aux/aime2024/reward/std@16:np.float64(0.1233210605235271) - val-aux/aime2024/reward/best@2/mean:np.float64(0.13576666666666667) - val-aux/aime2024/reward/best@2/std:np.float64(0.13510688311450086) - val-aux/aime2024/reward/worst@2/mean:np.float64(0.03766666666666667) - val-aux/aime2024/reward/worst@2/std:np.float64(0.07483145252271857) - val-aux/aime2024/reward/maj@2/mean:np.float64(0.08526666666666667) - val-aux/aime2024/reward/maj@2/std:np.float64(0.12349365629377203) - val-aux/aime2024/reward/best@4/mean:np.float64(0.1914666666666667) - val-aux/aime2024/reward/best@4/std:np.float64(0.1274170756803477) - val-aux/aime2024/reward/worst@4/mean:np.float64(0.0136) - val-aux/aime2024/reward/worst@4/std:np.float64(0.03448407724227123) - val-aux/aime2024/reward/maj@4/mean:np.float64(0.10193333333333335) - val-aux/aime2024/reward/maj@4/std:np.float64(0.1108778111076184) - val-aux/aime2024/reward/best@8/mean:np.float64(0.24556666666666663) - 
val-aux/aime2024/reward/best@8/std:np.float64(0.10710572824430921) - val-aux/aime2024/reward/worst@8/mean:np.float64(0.0037) - val-aux/aime2024/reward/worst@8/std:np.float64(0.011483223940059952) - val-aux/aime2024/reward/maj@8/mean:np.float64(0.11203333333333333) - val-aux/aime2024/reward/maj@8/std:np.float64(0.08576824316798112) - val-aux/aime2024/reward/best@16/mean:np.float64(0.29209999999999997) - val-aux/aime2024/reward/best@16/std:np.float64(0.07230124017789609) - val-aux/aime2024/reward/worst@16/mean:np.float64(0.0005) - val-aux/aime2024/reward/worst@16/std:np.float64(0.004051748593714407) - val-aux/aime2024/reward/maj@16/mean:np.float64(0.11670000000000001) - val-aux/aime2024/reward/maj@16/std:np.float64(0.05440276939165014) - val-aux/aime2024/score/mean@16:np.float64(0.0875) - val-aux/aime2024/score/std@16:np.float64(0.1233210605235271) - val-aux/aime2024/score/best@2/mean:np.float64(0.13576666666666667) - val-aux/aime2024/score/best@2/std:np.float64(0.13510688311450086) - val-aux/aime2024/score/worst@2/mean:np.float64(0.03766666666666667) - val-aux/aime2024/score/worst@2/std:np.float64(0.07483145252271857) - val-aux/aime2024/score/maj@2/mean:np.float64(0.08526666666666667) - val-aux/aime2024/score/maj@2/std:np.float64(0.12349365629377203) - val-aux/aime2024/score/best@4/mean:np.float64(0.1914666666666667) - val-aux/aime2024/score/best@4/std:np.float64(0.1274170756803477) - val-aux/aime2024/score/worst@4/mean:np.float64(0.0136) - val-aux/aime2024/score/worst@4/std:np.float64(0.03448407724227123) - val-aux/aime2024/score/maj@4/mean:np.float64(0.10193333333333335) - val-aux/aime2024/score/maj@4/std:np.float64(0.1108778111076184) - val-aux/aime2024/score/best@8/mean:np.float64(0.24556666666666663) - val-aux/aime2024/score/best@8/std:np.float64(0.10710572824430921) - val-aux/aime2024/score/worst@8/mean:np.float64(0.0037) - val-aux/aime2024/score/worst@8/std:np.float64(0.011483223940059952) - val-aux/aime2024/score/maj@8/mean:np.float64(0.11203333333333333) - 
val-aux/aime2024/score/maj@8/std:np.float64(0.08576824316798112) - val-aux/aime2024/score/best@16/mean:np.float64(0.29209999999999997) - val-aux/aime2024/score/best@16/std:np.float64(0.07230124017789609) - val-aux/aime2024/score/worst@16/mean:np.float64(0.0005) - val-aux/aime2024/score/worst@16/std:np.float64(0.004051748593714407) - val-aux/aime2024/score/maj@16/mean:np.float64(0.11670000000000001) - val-aux/aime2024/score/maj@16/std:np.float64(0.05440276939165014) - val-core/aime2024/acc/mean@16:np.float64(0.0875) - val-aux/aime2024/acc/std@16:np.float64(0.1233210605235271) - val-aux/aime2024/acc/best@2/mean:np.float64(0.13576666666666667) - val-aux/aime2024/acc/best@2/std:np.float64(0.13510688311450086) - val-aux/aime2024/acc/worst@2/mean:np.float64(0.03766666666666667) - val-aux/aime2024/acc/worst@2/std:np.float64(0.07483145252271857) - val-aux/aime2024/acc/maj@2/mean:np.float64(0.08526666666666667) - val-aux/aime2024/acc/maj@2/std:np.float64(0.12349365629377203) - val-aux/aime2024/acc/best@4/mean:np.float64(0.1914666666666667) - val-aux/aime2024/acc/best@4/std:np.float64(0.1274170756803477) - val-aux/aime2024/acc/worst@4/mean:np.float64(0.0136) - val-aux/aime2024/acc/worst@4/std:np.float64(0.03448407724227123) - val-aux/aime2024/acc/maj@4/mean:np.float64(0.10193333333333335) - val-aux/aime2024/acc/maj@4/std:np.float64(0.1108778111076184) - val-aux/aime2024/acc/best@8/mean:np.float64(0.24556666666666663) - val-aux/aime2024/acc/best@8/std:np.float64(0.10710572824430921) - val-aux/aime2024/acc/worst@8/mean:np.float64(0.0037) - val-aux/aime2024/acc/worst@8/std:np.float64(0.011483223940059952) - val-aux/aime2024/acc/maj@8/mean:np.float64(0.11203333333333333) - val-aux/aime2024/acc/maj@8/std:np.float64(0.08576824316798112) - val-core/aime2024/acc/best@16/mean:np.float64(0.29209999999999997) - val-core/aime2024/acc/best@16/std:np.float64(0.07230124017789609) - val-aux/aime2024/acc/worst@16/mean:np.float64(0.0005) - 
val-aux/aime2024/acc/worst@16/std:np.float64(0.004051748593714407) - val-core/aime2024/acc/maj@16/mean:np.float64(0.11670000000000001) - val-core/aime2024/acc/maj@16/std:np.float64(0.05440276939165014) - val-aux/aime2025/reward/mean@16:np.float64(0.06458333333333334) - val-aux/aime2025/reward/std@16:np.float64(0.07833010503893298) - val-aux/aime2025/reward/best@2/mean:np.float64(0.09956666666666666) - val-aux/aime2025/reward/best@2/std:np.float64(0.07730289784509704) - val-aux/aime2025/reward/worst@2/mean:np.float64(0.0264) - val-aux/aime2025/reward/worst@2/std:np.float64(0.057030833980974104) - val-aux/aime2025/reward/maj@2/mean:np.float64(0.06216666666666666) - val-aux/aime2025/reward/maj@2/std:np.float64(0.07747361127702383) - val-aux/aime2025/reward/best@4/mean:np.float64(0.1358666666666667) - val-aux/aime2025/reward/best@4/std:np.float64(0.05729602716011951) - val-aux/aime2025/reward/worst@4/mean:np.float64(0.005366666666666667) - val-aux/aime2025/reward/worst@4/std:np.float64(0.02608879987551415) - val-aux/aime2025/reward/maj@4/mean:np.float64(0.08363333333333335) - val-aux/aime2025/reward/maj@4/std:np.float64(0.0775921705680271) - val-aux/aime2025/reward/best@8/mean:np.float64(0.15786666666666666) - val-aux/aime2025/reward/best@8/std:np.float64(0.027543313337429605) - val-aux/aime2025/reward/worst@8/mean:np.float64(0.0003333333333333333) - val-aux/aime2025/reward/worst@8/std:np.float64(0.00467819578858231) - val-aux/aime2025/reward/maj@8/mean:np.float64(0.10706666666666666) - val-aux/aime2025/reward/maj@8/std:np.float64(0.07135639502164629) - val-aux/aime2025/reward/best@16/mean:np.float64(0.16553333333333334) - val-aux/aime2025/reward/best@16/std:np.float64(0.007355887193579246) - val-aux/aime2025/reward/worst@16/mean:np.float64(0.0) - val-aux/aime2025/reward/worst@16/std:np.float64(0.0) - val-aux/aime2025/reward/maj@16/mean:np.float64(0.1281) - val-aux/aime2025/reward/maj@16/std:np.float64(0.055512421073442335) - 
val-aux/aime2025/score/mean@16:np.float64(0.06458333333333334) - val-aux/aime2025/score/std@16:np.float64(0.07833010503893298) - val-aux/aime2025/score/best@2/mean:np.float64(0.09956666666666666) - val-aux/aime2025/score/best@2/std:np.float64(0.07730289784509704) - val-aux/aime2025/score/worst@2/mean:np.float64(0.0264) - val-aux/aime2025/score/worst@2/std:np.float64(0.057030833980974104) - val-aux/aime2025/score/maj@2/mean:np.float64(0.06216666666666666) - val-aux/aime2025/score/maj@2/std:np.float64(0.07747361127702383) - val-aux/aime2025/score/best@4/mean:np.float64(0.1358666666666667) - val-aux/aime2025/score/best@4/std:np.float64(0.05729602716011951) - val-aux/aime2025/score/worst@4/mean:np.float64(0.005366666666666667) - val-aux/aime2025/score/worst@4/std:np.float64(0.02608879987551415) - val-aux/aime2025/score/maj@4/mean:np.float64(0.08363333333333335) - val-aux/aime2025/score/maj@4/std:np.float64(0.0775921705680271) - val-aux/aime2025/score/best@8/mean:np.float64(0.15786666666666666) - val-aux/aime2025/score/best@8/std:np.float64(0.027543313337429605) - val-aux/aime2025/score/worst@8/mean:np.float64(0.0003333333333333333) - val-aux/aime2025/score/worst@8/std:np.float64(0.00467819578858231) - val-aux/aime2025/score/maj@8/mean:np.float64(0.10706666666666666) - val-aux/aime2025/score/maj@8/std:np.float64(0.07135639502164629) - val-aux/aime2025/score/best@16/mean:np.float64(0.16553333333333334) - val-aux/aime2025/score/best@16/std:np.float64(0.007355887193579246) - val-aux/aime2025/score/worst@16/mean:np.float64(0.0) - val-aux/aime2025/score/worst@16/std:np.float64(0.0) - val-aux/aime2025/score/maj@16/mean:np.float64(0.1281) - val-aux/aime2025/score/maj@16/std:np.float64(0.055512421073442335) - val-core/aime2025/acc/mean@16:np.float64(0.06458333333333334) - val-aux/aime2025/acc/std@16:np.float64(0.07833010503893298) - val-aux/aime2025/acc/best@2/mean:np.float64(0.09956666666666666) - val-aux/aime2025/acc/best@2/std:np.float64(0.07730289784509704) - 
val-aux/aime2025/acc/worst@2/mean:np.float64(0.0264) - val-aux/aime2025/acc/worst@2/std:np.float64(0.057030833980974104) - val-aux/aime2025/acc/maj@2/mean:np.float64(0.06216666666666666) - val-aux/aime2025/acc/maj@2/std:np.float64(0.07747361127702383) - val-aux/aime2025/acc/best@4/mean:np.float64(0.1358666666666667) - val-aux/aime2025/acc/best@4/std:np.float64(0.05729602716011951) - val-aux/aime2025/acc/worst@4/mean:np.float64(0.005366666666666667) - val-aux/aime2025/acc/worst@4/std:np.float64(0.02608879987551415) - val-aux/aime2025/acc/maj@4/mean:np.float64(0.08363333333333335) - val-aux/aime2025/acc/maj@4/std:np.float64(0.0775921705680271) - val-aux/aime2025/acc/best@8/mean:np.float64(0.15786666666666666) - val-aux/aime2025/acc/best@8/std:np.float64(0.027543313337429605) - val-aux/aime2025/acc/worst@8/mean:np.float64(0.0003333333333333333) - val-aux/aime2025/acc/worst@8/std:np.float64(0.00467819578858231) - val-aux/aime2025/acc/maj@8/mean:np.float64(0.10706666666666666) - val-aux/aime2025/acc/maj@8/std:np.float64(0.07135639502164629) - val-core/aime2025/acc/best@16/mean:np.float64(0.16553333333333334) - val-core/aime2025/acc/best@16/std:np.float64(0.007355887193579246) - val-aux/aime2025/acc/worst@16/mean:np.float64(0.0) - val-aux/aime2025/acc/worst@16/std:np.float64(0.0) - val-core/aime2025/acc/maj@16/mean:np.float64(0.1281) - val-core/aime2025/acc/maj@16/std:np.float64(0.055512421073442335) - val-aux/math500/reward/mean@4:np.float64(0.688) - val-aux/math500/reward/std@4:np.float64(0.13826279441628825) - val-aux/math500/reward/best@2/mean:np.float64(0.75049) - val-aux/math500/reward/best@2/std:np.float64(0.11420294898355356) - val-aux/math500/reward/worst@2/mean:np.float64(0.6259060000000001) - val-aux/math500/reward/worst@2/std:np.float64(0.12328296313358061) - val-aux/math500/reward/maj@2/mean:np.float64(0.68845) - val-aux/math500/reward/maj@2/std:np.float64(0.13817503613369944) - val-aux/math500/reward/best@4/mean:np.float64(0.79627) - 
val-aux/math500/reward/best@4/std:np.float64(0.07106154505293932) - val-aux/math500/reward/worst@4/mean:np.float64(0.5735779999999999) - val-aux/math500/reward/worst@4/std:np.float64(0.0862238260773842) - val-aux/math500/reward/maj@4/mean:np.float64(0.70205) - val-aux/math500/reward/maj@4/std:np.float64(0.12638017647408106) - val-aux/math500/score/mean@4:np.float64(0.688) - val-aux/math500/score/std@4:np.float64(0.13826279441628825) - val-aux/math500/score/best@2/mean:np.float64(0.75049) - val-aux/math500/score/best@2/std:np.float64(0.11420294898355356) - val-aux/math500/score/worst@2/mean:np.float64(0.6259060000000001) - val-aux/math500/score/worst@2/std:np.float64(0.12328296313358061) - val-aux/math500/score/maj@2/mean:np.float64(0.68845) - val-aux/math500/score/maj@2/std:np.float64(0.13817503613369944) - val-aux/math500/score/best@4/mean:np.float64(0.79627) - val-aux/math500/score/best@4/std:np.float64(0.07106154505293932) - val-aux/math500/score/worst@4/mean:np.float64(0.5735779999999999) - val-aux/math500/score/worst@4/std:np.float64(0.0862238260773842) - val-aux/math500/score/maj@4/mean:np.float64(0.70205) - val-aux/math500/score/maj@4/std:np.float64(0.12638017647408106) - val-core/math500/acc/mean@4:np.float64(0.688) - val-aux/math500/acc/std@4:np.float64(0.13826279441628825) - val-aux/math500/acc/best@2/mean:np.float64(0.75049) - val-aux/math500/acc/best@2/std:np.float64(0.11420294898355356) - val-aux/math500/acc/worst@2/mean:np.float64(0.6259060000000001) - val-aux/math500/acc/worst@2/std:np.float64(0.12328296313358061) - val-aux/math500/acc/maj@2/mean:np.float64(0.68845) - val-aux/math500/acc/maj@2/std:np.float64(0.13817503613369944) - val-core/math500/acc/best@4/mean:np.float64(0.79627) - val-core/math500/acc/best@4/std:np.float64(0.07106154505293932) - val-aux/math500/acc/worst@4/mean:np.float64(0.5735779999999999) - val-aux/math500/acc/worst@4/std:np.float64(0.0862238260773842) - val-core/math500/acc/maj@4/mean:np.float64(0.70205) - 
val-core/math500/acc/maj@4/std:np.float64(0.12638017647408106) - val-aux/num_turns/min:np.int32(2) - val-aux/num_turns/max:np.int32(2) - val-aux/num_turns/mean:np.float64(2.0) - val-aux/response_length/clip_ratio:0.1331081081081081 - val-aux/aime2024/response_length/clip_ratio:0.30625 - val-aux/aime2025/response_length/clip_ratio:0.20416666666666666 - val-aux/math500/response_length/clip_ratio:0.0745 - training/global_step:350 - training/epoch:0 - critic/score/mean:0.5279255509376526 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6172211766242981 - critic/rewards/max:1.879536509513855 - critic/rewards/min:-0.6852239966392517 - critic/advantages/mean:-0.018871430307626724 - critic/advantages/max:2.4720444679260254 - critic/advantages/min:-2.4747862815856934 - critic/returns/mean:-0.018871430307626724 - critic/returns/max:2.4720444679260254 - critic/returns/min:-2.4747862815856934 - response_length/mean:2047.43212890625 - response_length/max:8192.0 - response_length/min:171.0 - response_length/clip_ratio:0.11702127754688263 - response_length_non_aborted/mean:2047.43212890625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:171.0 - response_length_non_aborted/clip_ratio:0.11702127754688263 - response/aborted_ratio:0.0 - prompt_length/mean:235.58511352539062 - prompt_length/max:543.0 - prompt_length/min:174.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00011106021702289581 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.8161393124610186) - timing_s/agent_loop/generate_sequences/max:np.float64(47.018700894899666) - timing_s/agent_loop/generate_sequences/mean:np.float64(13.7223313212844) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - 
timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(47.018700894899666) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:190 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:49.024770389311016 - timing_s/reward:0.00019187480211257935 - timing_s/old_log_prob:12.347833682782948 - timing_s/ref:42.06991957593709 - timing_s/adv:0.09776824899017811 - timing_s/update_actor:29.568609696812928 - timing_s/save_checkpoint:59.774950104765594 - timing_s/update_weights:49.45599665958434 - timing_s/step:242.84042533300817 - timing_s/testing:177.65289004798979 - timing_s/stop_profile:0.0004581129178404808 - timing_per_token_ms/adv:5.694699296795319e-05 - timing_per_token_ms/update_actor:0.017222804191222848 - timing_per_token_ms/gen:0.03184111025766643 - timing_per_token_ms/ref:0.02450443205231103 - perf/total_num_tokens:2629185 - perf/time_per_step:242.84042533300817 - perf/throughput:2706.700291348307 - frontier/active_count:36.0 - frontier/completed_count:28.0 - frontier/blacklisted_count:1734.0 - frontier/mean_score:2.6223239344581057 - frontier/mean_frontier_pct:0.6842009616918151 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - 
frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:160.0 - frontier/cluster_5/score:2.2538775225395273 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.8707151485989995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:1.7264422948601275 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.5517524750999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:96.0 - frontier/cluster_12/score:2.9903110989592996 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:128.0 - frontier/cluster_14/score:3.2547537042115096 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:144.0 - frontier/cluster_15/score:2.402490047567459 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:128.0 - frontier/cluster_16/score:2.3915571485989995 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:128.0 - frontier/cluster_17/score:2.9540917593500566 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.9932177692715096 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - 
frontier/cluster_21/score:2.234387138501299 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:192.0 - frontier/cluster_23/score:2.2348006548112176 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:144.0 - frontier/cluster_25/score:1.3638554314009999 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:160.0 - frontier/cluster_26/score:2.260688556587345 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:192.0 - frontier/cluster_27/score:1.5873082228586757 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.4713831000529627 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:176.0 - frontier/cluster_32/score:2.066619663496123 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:128.0 - frontier/cluster_33/score:3.5506832444900565 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.8657524751 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:176.0 - frontier/cluster_35/score:2.9911432713176715 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:176.0 - frontier/cluster_37/score:2.1911291125604717 - frontier/cluster_37/cluster_size:193.0 - 
frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.9802267325699994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:160.0 - frontier/cluster_42/score:1.7413248404050368 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:2.2840226732569997 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:112.0 - frontier/cluster_45/score:2.9397059463409994 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:160.0 - frontier/cluster_46/score:3.045312876939462 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:144.0 - frontier/cluster_47/score:3.602900392402089 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:112.0 - frontier/cluster_48/score:2.7433504198192993 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:192.0 - frontier/cluster_49/score:4.22100255271206 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:128.0 - frontier/cluster_50/score:1.5242240534299998 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.4481636050715094 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:144.0 - frontier/cluster_53/score:2.4599583225395274 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:112.0 - frontier/cluster_54/score:3.0071163989480563 
- frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.8220389347592993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:160.0 - frontier/cluster_57/score:2.7358231799201276 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:4.64153087114304 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:350.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.023874895140431604 - cluster/prob_snapshot/cluster_6:0.019816129121379426 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.018287874271601542 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.02703022775554605 - cluster/prob_snapshot/cluster_12:0.03167579569473701 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.03447698582504425 - cluster/prob_snapshot/cluster_15:0.02544911930128967 - cluster/prob_snapshot/cluster_16:0.025333309185681076 - cluster/prob_snapshot/cluster_17:0.03129213113152151 - cluster/prob_snapshot/cluster_18:0.0 - 
cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0317065854995147 - cluster/prob_snapshot/cluster_21:0.02366843721602978 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02367281751550898 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.014447060714602751 - cluster/prob_snapshot/cluster_26:0.023947043126319647 - cluster/prob_snapshot/cluster_27:0.016814053557620102 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.015586080820215285 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.021891308319863988 - cluster/prob_snapshot/cluster_33:0.03761171105853685 - cluster/prob_snapshot/cluster_34:0.04094917938481629 - cluster/prob_snapshot/cluster_35:0.03168461073796638 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.023210213189661367 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0315689739230593 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.01844552224082529 - cluster/prob_snapshot/cluster_43:0.024194216978097937 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.031139744955402185 - cluster/prob_snapshot/cluster_46:0.032258419048793126 - cluster/prob_snapshot/cluster_47:0.03816483735686716 - cluster/prob_snapshot/cluster_48:0.02905978827671464 - cluster/prob_snapshot/cluster_49:0.0447122757672948 - cluster/prob_snapshot/cluster_50:0.01614581497097594 - cluster/prob_snapshot/cluster_51:0.02593293059324977 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.026057869788012514 - cluster/prob_snapshot/cluster_54:0.031853811035421085 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.02989332072209437 - cluster/prob_snapshot/cluster_57:0.028980053658709704 - cluster/prob_snapshot/cluster_58:0.0 - 
cluster/prob_snapshot/cluster_59:0.049166852116594016 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(RewardLoopWorker pid=2826755)[0m WARNING:2026-04-12 23:42:20,507:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  44%|████▍     | 351/800 [12:12:33<26:48:29, 214.94s/it]
[36m(TaskRunner pid=2823680)[0m step:351 - global_seqlen/min:489395 - global_seqlen/max:752330 - global_seqlen/minmax_diff:262935 - global_seqlen/balanced_min:592180 - global_seqlen/balanced_max:592241 - global_seqlen/mean:592219.5 - frontier/skipped_zero_acc_count:39.0 - actor/entropy:np.float64(0.15551512862245243) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:-0.0026828618720173836 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.1538423991878517) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0016608586919675064) - actor/ppo_kl:np.float64(-0.0008909523143958924) - actor/pg_clipfrac_lower:np.float64(0.00010482920521705333) - actor/grad_norm:np.float64(0.7709906746943792) - perf/mfu/actor:np.float64(0.2720518095765746) - perf/max_memory_allocated_gb:np.float64(122.3991756439209) - perf/max_memory_reserved_gb:np.float64(128.71875) - perf/cpu_memory_used_gb:np.float64(678.0727958679199) - actor/lr:np.float64(1e-06) - training/global_step:351 - training/epoch:0 - critic/score/mean:0.591292142868042 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6701219081878662 - critic/rewards/max:2.012352705001831 - critic/rewards/min:-0.10350008308887482 - critic/advantages/mean:-0.06335792690515518 - critic/advantages/max:2.4707694053649902 - critic/advantages/min:-2.4744179248809814 - critic/returns/mean:-0.06335792690515518 - critic/returns/max:2.4707694053649902 - critic/returns/min:-2.4744179248809814 - response_length/mean:1903.9873046875 - response_length/max:8192.0 - response_length/min:165.0 - response_length/clip_ratio:0.10252808779478073 - response_length_non_aborted/mean:1903.9873046875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:165.0 - response_length_non_aborted/clip_ratio:0.10252808779478073 - response/aborted_ratio:0.0 - prompt_length/mean:245.30337524414062 - prompt_length/max:657.0 - prompt_length/min:171.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.917560964822769e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3434569323435426) - timing_s/agent_loop/generate_sequences/max:np.float64(43.22787109389901) - timing_s/agent_loop/generate_sequences/mean:np.float64(11.299292865925054) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(43.22787109389901) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:200 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:45.71612011734396 - timing_s/reward:0.0002212347462773323 - timing_s/old_log_prob:11.768376969732344 - timing_s/ref:32.76507952902466 - timing_s/adv:0.08625240344554186 - timing_s/update_actor:26.79412836022675 - timing_s/update_weights:39.430364187806845 - timing_s/step:156.97537914011627 - timing_s/stop_profile:5.208514630794525e-05 - timing_per_token_ms/adv:5.636325247455024e-05 - timing_per_token_ms/update_actor:0.01750912625358297 - timing_per_token_ms/gen:0.03372293074877896 - timing_per_token_ms/ref:0.02141095640319328 - perf/total_num_tokens:2368878 - perf/time_per_step:156.97537914011627 - perf/throughput:3772.6903622980562 - frontier/active_count:34.0 - frontier/completed_count:30.0 - frontier/blacklisted_count:1773.0 - frontier/mean_score:2.578574604500826 - frontier/mean_frontier_pct:0.6887342632878647 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:160.0 - frontier/cluster_5/score:2.2538775225395273 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.8707151485989995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:2.108509606402089 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.5517524750999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:2.9932177692715096 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:128.0 - frontier/cluster_14/score:3.2547537042115096 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:1.9817430332972212 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:144.0 - frontier/cluster_16/score:1.9740900040192997 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:144.0 - 
frontier/cluster_17/score:2.3678642315450396 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.9932177692715096 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:2.234387138501299 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:208.0 - frontier/cluster_23/score:1.8643604583678524 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:144.0 - frontier/cluster_25/score:1.3638554314009999 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:160.0 - frontier/cluster_26/score:2.4824819896111414 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:192.0 - frontier/cluster_27/score:1.5873082228586757 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:160.0 - frontier/cluster_30/score:1.9299681700370737 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:192.0 - frontier/cluster_32/score:1.7466337644472862 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:3.385478271143039 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.8657524751 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:176.0 - frontier/cluster_37/score:2.43379037879233 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.9802267325699994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:160.0 - frontier/cluster_42/score:1.7413248404050368 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:112.0 - frontier/cluster_43/score:2.2840226732569997 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:112.0 - frontier/cluster_45/score:2.957794162438699 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:176.0 - frontier/cluster_46/score:3.6317190138576234 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:144.0 - frontier/cluster_47/score:3.602900392402089 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:112.0 - frontier/cluster_48/score:2.7433504198192993 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:192.0 - frontier/cluster_49/score:3.8547017868984415 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:128.0 - frontier/cluster_50/score:1.5242240534299998 - 
frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:112.0 - frontier/cluster_51/score:2.4481636050715094 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:144.0 - frontier/cluster_53/score:2.4599583225395274 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.8220389347592993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:160.0 - frontier/cluster_57/score:2.7358231799201276 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:4.64153087114304 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:351.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.02570820144319326 - cluster/prob_snapshot/cluster_6:0.021337770753766798 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.024050104393080394 - 
cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.029105825852117614 - cluster/prob_snapshot/cluster_12:0.03414127192194314 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.03712440584684943 - cluster/prob_snapshot/cluster_15:0.022604178177013753 - cluster/prob_snapshot/cluster_16:0.022516886114175405 - cluster/prob_snapshot/cluster_17:0.027008357839295286 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.03414127192194314 - cluster/prob_snapshot/cluster_21:0.0254858900203013 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.02126528781938474 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.015556422129958593 - cluster/prob_snapshot/cluster_26:0.028315712113812598 - cluster/prob_snapshot/cluster_27:0.0181051717041436 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.022013623188521778 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.019922472368108155 - cluster/prob_snapshot/cluster_33:0.038615477773625353 - cluster/prob_snapshot/cluster_34:0.04409358643739295 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.027760325351663628 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.033993093422828406 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.01986191766300112 - cluster/prob_snapshot/cluster_43:0.026052043377561993 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.03373722280605495 - cluster/prob_snapshot/cluster_46:0.041424151516507066 - cluster/prob_snapshot/cluster_47:0.041095440254122574 - cluster/prob_snapshot/cluster_48:0.031291232339243705 - cluster/prob_snapshot/cluster_49:0.04396753996169472 - 
cluster/prob_snapshot/cluster_50:0.017385620388985352 - cluster/prob_snapshot/cluster_51:0.027924269396039827 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.028058802426162825 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.03218877010387961 - cluster/prob_snapshot/cluster_57:0.03120537505653692 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.052942278117090055 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826758) WARNING:2026-04-12 23:45:02,426:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  44%|████▍     | 352/800 [12:15:04<24:22:31, 195.87s/it]
[36m(TaskRunner pid=2823680)[0m step:352 - global_seqlen/min:456786 - global_seqlen/max:656406 - global_seqlen/minmax_diff:199620 - global_seqlen/balanced_min:554876 - global_seqlen/balanced_max:555115 - global_seqlen/mean:554970.5 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.2011193245292661) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01272720005363226 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.1207513683475554) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0008550108036712466) - actor/ppo_kl:np.float64(0.0001311477974008704) - actor/pg_clipfrac_lower:np.float64(1.6472714784693844e-05) - actor/grad_norm:np.float64(1.1212411398688953) - perf/mfu/actor:np.float64(0.27833887311182265) - perf/max_memory_allocated_gb:np.float64(122.3991756439209) - perf/max_memory_reserved_gb:np.float64(128.71875) - perf/cpu_memory_used_gb:np.float64(680.6712951660156) - actor/lr:np.float64(1e-06) - training/global_step:352 - training/epoch:0 - critic/score/mean:0.5505319237709045 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.583299458026886 - critic/rewards/max:1.5327949523925781 - critic/rewards/min:-0.0805121511220932 - critic/advantages/mean:-0.10700058192014694 - critic/advantages/max:2.47468638420105 - critic/advantages/min:-2.474764347076416 - critic/returns/mean:-0.10700058192014694 - critic/returns/max:2.47468638420105 - critic/returns/min:-2.474764347076416 - response_length/mean:1652.7686767578125 - response_length/max:8192.0 - response_length/min:129.0 - response_length/clip_ratio:0.08377659320831299 - response_length_non_aborted/mean:1652.7686767578125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:129.0 - response_length_non_aborted/clip_ratio:0.08377659320831299 - response/aborted_ratio:0.0 - prompt_length/mean:240.425537109375 - prompt_length/max:804.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:7.81109556555748e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.0848582088947296) - timing_s/agent_loop/generate_sequences/max:np.float64(41.39717549830675) - timing_s/agent_loop/generate_sequences/mean:np.float64(10.205828993582145) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(41.39717549830675) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:195 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:43.85005802009255 - timing_s/reward:0.00013113487511873245 - timing_s/old_log_prob:11.862579398788512 - timing_s/ref:32.96406139526516 - timing_s/adv:0.07869655545800924 - timing_s/update_actor:24.598579617217183 - timing_s/update_weights:36.078896074555814 - timing_s/step:149.85520413704216 - timing_s/stop_profile:7.567275315523148e-05 - timing_per_token_ms/adv:5.5276779124839146e-05 - timing_per_token_ms/update_actor:0.017278141900520752 - timing_per_token_ms/gen:0.0352809502592302 - timing_per_token_ms/ref:0.023154090165686692 - perf/total_num_tokens:2219882 - perf/time_per_step:149.85520413704216 - perf/throughput:3703.3782256402724 - frontier/active_count:31.0 - frontier/completed_count:33.0 - frontier/blacklisted_count:1807.0 - frontier/mean_score:2.5381416905033816 - frontier/mean_frontier_pct:0.6798706382791405 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:12.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:160.0 - frontier/cluster_5/score:2.2538775225395273 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:128.0 - frontier/cluster_6/score:1.8707151485989995 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:2.108509606402089 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.5517524750999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:2.9932177692715096 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:128.0 - frontier/cluster_14/score:3.1783275929480563 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:1.9817430332972212 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:144.0 - 
frontier/cluster_17/score:2.5575049620815276 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.9952524384900565 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:112.0 - frontier/cluster_21/score:2.234387138501299 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:208.0 - frontier/cluster_23/score:2.2050523208574964 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:144.0 - frontier/cluster_25/score:1.3638554314009999 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:160.0 - frontier/cluster_26/score:2.4824819896111414 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:208.0 - frontier/cluster_27/score:1.411115756001073 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:176.0 - frontier/cluster_30/score:2.2509777190259515 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:192.0 - frontier/cluster_32/score:1.7466337644472862 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:3.385478271143039 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.8657524751 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:176.0 - frontier/cluster_37/score:2.43379037879233 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:80.0 - frontier/cluster_39/score:2.9802267325699994 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:160.0 - frontier/cluster_42/score:1.7413248404050368 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:128.0 - frontier/cluster_43/score:2.4988158712798993 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:128.0 - frontier/cluster_45/score:2.9704559137070894 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:128.0 - frontier/cluster_48/score:2.820345293873509 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:192.0 - frontier/cluster_49/score:3.8547017868984415 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:128.0 - frontier/cluster_50/score:1.5242240534299998 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:128.0 - frontier/cluster_51/score:2.6137145235500565 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.021970825777669 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.8220389347592993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:160.0 - frontier/cluster_57/score:2.815076225944089 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:160.0 - frontier/cluster_59/score:4.149071609800128 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:352.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.02864525916955946 - cluster/prob_snapshot/cluster_6:0.02377552450306203 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.026797731257748238 - cluster/prob_snapshot/cluster_10:0.0 - 
cluster/prob_snapshot/cluster_11:0.03243104838431716 - cluster/prob_snapshot/cluster_12:0.038041773740706594 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.04039439442262883 - cluster/prob_snapshot/cluster_15:0.02518661383707563 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.03250415860384219 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.03806763301056786 - cluster/prob_snapshot/cluster_21:0.02839754956843605 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.028024723873297255 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.017333680251743942 - cluster/prob_snapshot/cluster_26:0.031550667356605507 - cluster/prob_snapshot/cluster_27:0.0179343270185129 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.028608404627839028 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.02219853401817593 - cluster/prob_snapshot/cluster_33:0.043027139460770636 - cluster/prob_snapshot/cluster_34:0.049131099816744114 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.03093182990987654 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.03787666644917252 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.02213106118365811 - cluster/prob_snapshot/cluster_43:0.0317582599471378 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.03775248594875584 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.03584468148013006 - cluster/prob_snapshot/cluster_49:0.048990653042520566 - cluster/prob_snapshot/cluster_50:0.01937185699149412 - cluster/prob_snapshot/cluster_51:0.03321854411945756 - 
cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.02569788187622059 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.03586620651049593 - cluster/prob_snapshot/cluster_57:0.035777715189854356 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.052731894429592543 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  44%|████▍     | 353/800 [12:17:34<22:36:27, 182.08s/it]
[36m(TaskRunner pid=2823680)[0m step:353 - global_seqlen/min:511850 - global_seqlen/max:647965 - global_seqlen/minmax_diff:136115 - global_seqlen/balanced_min:574217 - global_seqlen/balanced_max:574254 - global_seqlen/mean:574236.5 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.1615240993788061) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008998231962323189 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.024706435897314805) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0009696722026210805) - actor/ppo_kl:np.float64(6.090707793389257e-05) - actor/pg_clipfrac_lower:np.float64(3.89250554966882e-05) - actor/grad_norm:np.float64(0.6595018009344736) - perf/mfu/actor:np.float64(0.2831654136414363) - perf/max_memory_allocated_gb:np.float64(122.3991756439209) - perf/max_memory_reserved_gb:np.float64(128.71875) - perf/cpu_memory_used_gb:np.float64(677.984766960144) - actor/lr:np.float64(1e-06) - training/global_step:353 - training/epoch:0 - critic/score/mean:0.6291208863258362 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6661913394927979 - critic/rewards/max:1.5723406076431274 - critic/rewards/min:-0.09392733126878738 - critic/advantages/mean:-0.10766797512769699 - critic/advantages/max:2.4745492935180664 - critic/advantages/min:-2.4747891426086426 - critic/returns/mean:-0.10766797512769699 - critic/returns/max:2.4745492935180664 - critic/returns/min:-2.4747891426086426 - response_length/mean:1653.5439453125 - response_length/max:8192.0 - response_length/min:208.0 - response_length/clip_ratio:0.0714285746216774 - response_length_non_aborted/mean:1653.5439453125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:208.0 - response_length_non_aborted/clip_ratio:0.0714285746216774 - response/aborted_ratio:0.0 - prompt_length/mean:226.95603942871094 - prompt_length/max:337.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.866245090961456e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6299245553091168) - timing_s/agent_loop/generate_sequences/max:np.float64(45.47763509117067) - timing_s/agent_loop/generate_sequences/mean:np.float64(11.089948000012555) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(45.47763509117067) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:189 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:47.52853564824909 - timing_s/reward:0.00015817862004041672 - timing_s/old_log_prob:11.025893397629261 - timing_s/ref:27.799579361453652 - timing_s/adv:0.07763708755373955 - timing_s/update_actor:24.902744236402214 - timing_s/update_weights:36.650947079993784 - timing_s/step:148.39235110860318 - timing_s/stop_profile:4.9795955419540405e-05 - timing_per_token_ms/adv:5.671063601986521e-05 - timing_per_token_ms/update_actor:0.018190410134961046 - timing_per_token_ms/gen:0.0394827424016424 - timing_per_token_ms/ref:0.02030642668790862 - perf/total_num_tokens:2296946 - perf/time_per_step:148.39235110860318 - perf/throughput:3869.7176485851105 - frontier/active_count:30.0 - frontier/completed_count:34.0 - frontier/blacklisted_count:1844.0 - frontier/mean_score:2.4774253563057416 - frontier/mean_frontier_pct:0.6961535642207516 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:4.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:160.0 - frontier/cluster_5/score:2.2538775225395273 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:144.0 - frontier/cluster_6/score:1.6095006040192996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:160.0 - frontier/cluster_9/score:2.108509606402089 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.5517524750999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:2.9952524384900565 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:3.7248293150636393 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.287220123308055 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:160.0 - 
frontier/cluster_17/score:2.0902534734570692 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:112.0 - frontier/cluster_20/score:2.9952524384900565 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:1.8640709969509095 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:208.0 - frontier/cluster_23/score:2.2050523208574964 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:144.0 - frontier/cluster_25/score:1.3638554314009999 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:176.0 - frontier/cluster_26/score:2.6377373927277987 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:208.0 - frontier/cluster_27/score:1.411115756001073 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:192.0 - frontier/cluster_30/score:1.8756844033181659 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:192.0 - frontier/cluster_32/score:1.7466337644472862 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:3.385478271143039 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:80.0 - frontier/cluster_34/score:3.6060267325699997 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:176.0 - frontier/cluster_37/score:2.43379037879233 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:3.5861587127989996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:160.0 - frontier/cluster_42/score:1.7413248404050368 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:128.0 - frontier/cluster_43/score:2.6491711098959296 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:128.0 - frontier/cluster_45/score:2.9704559137070894 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:128.0 - frontier/cluster_48/score:2.820345293873509 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:208.0 - frontier/cluster_49/score:3.598291250828909 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:144.0 - frontier/cluster_50/score:1.3669568374009997 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:128.0 - frontier/cluster_51/score:2.7296001664850396 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.021970825777669 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.8220389347592993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:176.0 - frontier/cluster_57/score:2.870553358160862 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:353.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.03032553556461049 - cluster/prob_snapshot/cluster_6:0.021655554625459178 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.028369635181074063 - cluster/prob_snapshot/cluster_10:0.0 - 
cluster/prob_snapshot/cluster_11:0.03433339197088992 - cluster/prob_snapshot/cluster_12:0.04030060792570131 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.050116939690135774 - cluster/prob_snapshot/cluster_15:0.030774154540269518 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.028124002042911052 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.04030060792570131 - cluster/prob_snapshot/cluster_21:0.025080755607918073 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.029668600848659552 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.018350440951794382 - cluster/prob_snapshot/cluster_26:0.03549030429264018 - cluster/prob_snapshot/cluster_27:0.018986320514957033 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.02523701199909042 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.02350065778304365 - cluster/prob_snapshot/cluster_33:0.04555102958704082 - cluster/prob_snapshot/cluster_34:0.04851847131527968 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.03274623219353716 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.048251150516283925 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.02342922712044418 - cluster/prob_snapshot/cluster_43:0.035644142996451904 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.03996697493692322 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.037947262288447164 - cluster/prob_snapshot/cluster_49:0.04841439173495513 - cluster/prob_snapshot/cluster_50:0.01839216983768669 - cluster/prob_snapshot/cluster_51:0.036726302160661035 - 
cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.027205270727682227 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.037970049936135244 - cluster/prob_snapshot/cluster_57:0.0386228031836156 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826756) WARNING:2026-04-12 23:50:01,652:WARNING: Error in configuration: macro '\frac' failed its substitution!
(RewardLoopWorker pid=2826758) WARNING:2026-04-12 23:50:37,492:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  44%|████▍     | 354/800 [12:20:45<22:53:47, 184.81s/it]
[36m(TaskRunner pid=2823680)[0m step:354 - global_seqlen/min:550525 - global_seqlen/max:766029 - global_seqlen/minmax_diff:215504 - global_seqlen/balanced_min:663505 - global_seqlen/balanced_max:663730 - global_seqlen/mean:663613.25 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.15095309622357694) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.00040435537812300026 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.05578063687426038) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0012044728901688655) - actor/ppo_kl:np.float64(0.0006487060433444788) - actor/pg_clipfrac_lower:np.float64(1.969893503196452e-05) - actor/grad_norm:np.float64(0.5604449820059997) - perf/mfu/actor:np.float64(0.2650806416326801) - perf/max_memory_allocated_gb:np.float64(122.3991756439209) - perf/max_memory_reserved_gb:np.float64(128.71875) - perf/cpu_memory_used_gb:np.float64(677.842191696167) - actor/lr:np.float64(1e-06) - training/global_step:354 - training/epoch:0 - critic/score/mean:0.5451030731201172 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6278138160705566 - critic/rewards/max:1.6618950366973877 - critic/rewards/min:-0.07835434377193451 - critic/advantages/mean:0.024177348241209984 - critic/advantages/max:2.4737374782562256 - critic/advantages/min:-2.4748477935791016 - critic/returns/mean:0.024177348241209984 - critic/returns/max:2.4737374782562256 - critic/returns/min:-2.4748477935791016 - response_length/mean:2200.0244140625 - response_length/max:8192.0 - response_length/min:219.0 - response_length/clip_ratio:0.1430412381887436 - response_length_non_aborted/mean:2200.0244140625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:219.0 - response_length_non_aborted/clip_ratio:0.1430412381887436 - response/aborted_ratio:0.0 - prompt_length/mean:236.32989501953125 - prompt_length/max:804.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.353365749120712e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.702434972859919) - timing_s/agent_loop/generate_sequences/max:np.float64(46.50940773636103) - timing_s/agent_loop/generate_sequences/mean:np.float64(13.705923938022352) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(46.50940773636103) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:184 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:48.72718522883952 - timing_s/reward:0.00011450238525867462 - timing_s/old_log_prob:14.085607030428946 - timing_s/ref:45.79318190924823 - timing_s/adv:0.10270967520773411 - timing_s/update_actor:31.112254826352 - timing_s/update_weights:49.18872526474297 - timing_s/step:189.4326421385631 - timing_s/stop_profile:5.4135918617248535e-05 - timing_per_token_ms/adv:5.4326180905397306e-05 - timing_per_token_ms/update_actor:0.01645619052589454 - timing_per_token_ms/gen:0.028541848016475635 - timing_per_token_ms/ref:0.024221366483770712 - perf/total_num_tokens:2654453 - perf/time_per_step:189.4326421385631 - perf/throughput:3503.1620871052996 - frontier/active_count:28.0 - frontier/completed_count:36.0 - frontier/blacklisted_count:1874.0 - frontier/mean_score:2.510022748658934 - frontier/mean_frontier_pct:0.7160339706117458 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:5.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:160.0 - frontier/cluster_5/score:2.2538775225395273 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:144.0 - frontier/cluster_6/score:1.6095006040192996 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:80.0 - frontier/cluster_11/score:2.5517524750999994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:112.0 - frontier/cluster_12/score:2.9952524384900565 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:3.5073805205445474 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.287220123308055 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:176.0 - 
frontier/cluster_17/score:1.7631774314199484 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:128.0 - frontier/cluster_20/score:2.9966767069430396 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.2048496978656367 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:224.0 - frontier/cluster_23/score:2.443536624600247 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:160.0 - frontier/cluster_25/score:1.2546988019806997 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:176.0 - frontier/cluster_26/score:2.6377373927277987 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:208.0 - frontier/cluster_27/score:1.887781029200751 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:192.0 - frontier/cluster_30/score:1.8756844033181659 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:192.0 - frontier/cluster_32/score:1.7466337644472862 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:144.0 - frontier/cluster_33/score:3.269834789800127 - frontier/cluster_33/cluster_size:171.0 - 
frontier/cluster_34/frontier:96.0 - frontier/cluster_34/score:3.4242187127989996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:192.0 - frontier/cluster_37/score:2.603653265154631 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:3.5861587127989996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:176.0 - frontier/cluster_42/score:1.5189273882835257 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:2.7544197769271506 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:128.0 - frontier/cluster_45/score:2.9704559137070894 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:2.2742417057114563 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:208.0 - frontier/cluster_49/score:3.418803875580236 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - 
frontier/cluster_51/frontier:128.0 - frontier/cluster_51/score:2.7296001664850396 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.021970825777669 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.8220389347592993 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:176.0 - frontier/cluster_57/score:2.870553358160862 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:354.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.03206967978596635 - cluster/prob_snapshot/cluster_6:0.022901053171718273 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - 
cluster/prob_snapshot/cluster_11:0.03630804422651093 - cluster/prob_snapshot/cluster_12:0.042618458909107114 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.04990536045395385 - cluster/prob_snapshot/cluster_15:0.032544100653642064 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.025087670055722266 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.04263872435510391 - cluster/prob_snapshot/cluster_21:0.0313720790413959 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.03476827658670467 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.017852695368299885 - cluster/prob_snapshot/cluster_26:0.03753149525575729 - cluster/prob_snapshot/cluster_27:0.026860613545795876 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.026688494646403313 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0248522756755959 - cluster/prob_snapshot/cluster_33:0.04652540060994536 - cluster/prob_snapshot/cluster_34:0.04872207852396821 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.03704652344778438 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.05102626936513151 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.021612316762226625 - cluster/prob_snapshot/cluster_43:0.039191730410735946 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.042265637337552266 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.032359435030825744 - cluster/prob_snapshot/cluster_49:0.04864503259136438 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.03883858036095236 - 
cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.02876995589635843 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.040153861102127895 - cluster/prob_snapshot/cluster_57:0.04084415682934908 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  44%|████▍     | 355/800 [12:23:43<22:35:09, 182.72s/it]
[36m(TaskRunner pid=2823680)[0m step:355 - global_seqlen/min:635224 - global_seqlen/max:758062 - global_seqlen/minmax_diff:122838 - global_seqlen/balanced_min:707529 - global_seqlen/balanced_max:707677 - global_seqlen/mean:707611.5 - frontier/skipped_zero_acc_count:39.0 - actor/entropy:np.float64(0.1558770406163401) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.006466724444180727 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.024599557698820718) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0011279011315006452) - actor/ppo_kl:np.float64(-0.00022235577936650466) - actor/pg_clipfrac_lower:np.float64(3.098629648674331e-05) - actor/grad_norm:np.float64(0.6282867267727852) - perf/mfu/actor:np.float64(0.3041777844948711) - perf/max_memory_allocated_gb:np.float64(122.3991756439209) - perf/max_memory_reserved_gb:np.float64(128.71875) - perf/cpu_memory_used_gb:np.float64(678.338005065918) - actor/lr:np.float64(1e-06) - training/global_step:355 - training/epoch:0 - critic/score/mean:0.574438214302063 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6135943531990051 - critic/rewards/max:1.2428454160690308 - critic/rewards/min:-0.09720167517662048 - critic/advantages/mean:-0.09532145410776138 - critic/advantages/max:2.474496841430664 - critic/advantages/min:-2.4748194217681885 - critic/returns/mean:-0.09532145410776138 - critic/returns/max:2.474496841430664 - critic/returns/min:-2.4748194217681885 - response_length/mean:1996.9298095703125 - response_length/max:8192.0 - response_length/min:148.0 - response_length/clip_ratio:0.10393258184194565 - response_length_non_aborted/mean:1996.9298095703125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:148.0 - response_length_non_aborted/clip_ratio:0.10393258184194565 - response/aborted_ratio:0.0 - prompt_length/mean:231.64044189453125 - prompt_length/max:398.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.550658822059631e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.174310738220811) - timing_s/agent_loop/generate_sequences/max:np.float64(48.1254544230178) - timing_s/agent_loop/generate_sequences/mean:np.float64(14.666377068022484) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(48.1254544230178) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:187 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:50.17538275569677 - timing_s/reward:0.00012695975601673126 - timing_s/old_log_prob:12.610476083122194 - timing_s/ref:39.00496339891106 - timing_s/adv:0.09230272844433784 - timing_s/update_actor:28.98113901913166 - timing_s/update_weights:44.99430892802775 - timing_s/step:176.28211343381554 - timing_s/stop_profile:6.487313657999039e-05 - timing_per_token_ms/adv:5.817122660415987e-05 - timing_per_token_ms/update_actor:0.01826455656882572 - timing_per_token_ms/gen:0.03528969524543771 - timing_per_token_ms/ref:0.02458179300662052 - perf/total_num_tokens:2830446 - perf/time_per_step:176.28211343381554 - perf/throughput:4014.085639299248 - frontier/active_count:26.0 - frontier/completed_count:38.0 - frontier/blacklisted_count:1911.0 - frontier/mean_score:2.4707396824354717 - frontier/mean_frontier_pct:0.7202896464458958 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:5.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:176.0 - frontier/cluster_5/score:1.877714265777669 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:160.0 - frontier/cluster_6/score:1.4266504228135097 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:96.0 - frontier/cluster_11/score:2.6862267325699993 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:128.0 - frontier/cluster_12/score:2.9966767069430396 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:144.0 - frontier/cluster_14/score:3.5073805205445474 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:160.0 - frontier/cluster_15/score:2.287220123308055 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:176.0 - 
frontier/cluster_17/score:1.7631774314199484 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.2048496978656367 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:160.0 - frontier/cluster_25/score:1.2546988019806997 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:176.0 - frontier/cluster_26/score:2.6377373927277987 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:208.0 - frontier/cluster_27/score:1.887781029200751 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:192.0 - frontier/cluster_30/score:1.8756844033181659 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:192.0 - frontier/cluster_32/score:2.1226436351131 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:160.0 - frontier/cluster_33/score:3.1888843528600885 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:112.0 - 
frontier/cluster_34/score:2.6969530989592996 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:192.0 - frontier/cluster_37/score:2.7225572856082416 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:96.0 - frontier/cluster_39/score:3.4103110989592995 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:192.0 - frontier/cluster_42/score:1.363249171798468 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:2.7544197769271506 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:128.0 - frontier/cluster_45/score:2.9704559137070894 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:2.2742417057114563 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:208.0 - frontier/cluster_49/score:3.418803875580236 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:144.0 - 
frontier/cluster_51/score:2.8107201165395272 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:160.0 - frontier/cluster_53/score:2.315379578044368 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:112.0 - frontier/cluster_56/score:2.8754272543315094 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:176.0 - frontier/cluster_57/score:2.909387350712603 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:355.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.02923002369144708 - cluster/prob_snapshot/cluster_6:0.02220839795397789 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.041815984713254845 - 
cluster/prob_snapshot/cluster_12:0.046648700889150144 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.054598730796763355 - cluster/prob_snapshot/cluster_15:0.03560472411075828 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.027447050401614315 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.034322479239407046 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.019531659516011053 - cluster/prob_snapshot/cluster_26:0.04106116030881696 - cluster/prob_snapshot/cluster_27:0.029386731098274506 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.029198425205530967 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.03304279297103753 - cluster/prob_snapshot/cluster_33:0.049640761047731186 - cluster/prob_snapshot/cluster_34:0.04198296003500463 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0423815355776146 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.053087669425838126 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.021221442641866265 - cluster/prob_snapshot/cluster_43:0.042877532968215726 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0462405267481405 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.03540269153277765 - cluster/prob_snapshot/cluster_49:0.05321987487709369 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.04375394973230987 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.03604307702956074 - 
cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.04476123353748565 - cluster/prob_snapshot/cluster_57:0.04528988395031728 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826756) WARNING:2026-04-12 23:56:07,173:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m Training Progress:  44%|████▍     | 356/800 [12:26:52<22:45:20, 184.51s/it]
(RewardLoopWorker pid=2826758) WARNING:2026-04-12 23:56:09,009:WARNING: Error in configuration: macro '\frac' failed its substitution!
[36m(TaskRunner pid=2823680)[0m step:356 - global_seqlen/min:585923 - global_seqlen/max:647549 - global_seqlen/minmax_diff:61626 - global_seqlen/balanced_min:610367 - global_seqlen/balanced_max:610455 - global_seqlen/mean:610408.75 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.14845798752859843) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.009484363719820976 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.08766724716406316) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0012774651402985317) - actor/ppo_kl:np.float64(0.0018764667664035355) - actor/pg_clipfrac_lower:np.float64(2.6709894622659442e-05) - actor/grad_norm:np.float64(0.6126260459423065) - perf/mfu/actor:np.float64(0.2709341318639173) - perf/max_memory_allocated_gb:np.float64(122.3991756439209) - perf/max_memory_reserved_gb:np.float64(128.71875) - perf/cpu_memory_used_gb:np.float64(105.2249526977539) - actor/lr:np.float64(1e-06) - training/global_step:356 - training/epoch:0 - critic/score/mean:0.6146907210350037 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6556487083435059 - critic/rewards/max:1.2285627126693726 - critic/rewards/min:-0.10937846451997757 - critic/advantages/mean:-0.1669447422027588 - critic/advantages/max:2.467963457107544 - critic/advantages/min:-2.4746339321136475 - critic/returns/mean:-0.1669447422027588 - critic/returns/max:2.467963457107544 - critic/returns/min:-2.4746339321136475 - response_length/mean:1973.3736572265625 - response_length/max:8192.0 - response_length/min:148.0 - response_length/clip_ratio:0.10180412232875824 - response_length_non_aborted/mean:1973.3736572265625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:148.0 - response_length_non_aborted/clip_ratio:0.10180412232875824 - response/aborted_ratio:0.0 - prompt_length/mean:239.0309295654297 - prompt_length/max:500.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.44346359372139e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.2056822087615728) - timing_s/agent_loop/generate_sequences/max:np.float64(45.114277065731585) - timing_s/agent_loop/generate_sequences/mean:np.float64(11.726287868518739) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(45.114277065731585) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:176 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:47.94770328141749 - timing_s/reward:0.00015372689813375473 - timing_s/old_log_prob:39.48887384403497 - timing_s/ref:34.25830522645265 - timing_s/adv:0.08632917236536741 - timing_s/update_actor:27.721436835825443 - timing_s/update_weights:37.20056855585426 - timing_s/step:187.20910695381463 - timing_s/stop_profile:6.262212991714478e-05 - timing_per_token_ms/adv:5.028417111889464e-05 - timing_per_token_ms/update_actor:0.01614691112309893 - timing_per_token_ms/gen:0.03131098639321789 - timing_per_token_ms/ref:0.0199544422244611 - perf/total_num_tokens:2441635 - perf/time_per_step:187.20910695381463 - perf/throughput:3260.571880996103 - frontier/active_count:21.0 - frontier/completed_count:43.0 - frontier/blacklisted_count:1941.0 - frontier/mean_score:2.35084117418989 - frontier/mean_frontier_pct:0.6852035003714028 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:5.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:176.0 - frontier/cluster_5/score:1.877714265777669 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:160.0 - frontier/cluster_6/score:1.4266504228135097 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:96.0 - frontier/cluster_11/score:2.6862267325699993 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:128.0 - frontier/cluster_12/score:2.9966767069430396 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:2.755166364381183 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:176.0 - 
frontier/cluster_17/score:1.7631774314199484 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.2048496978656367 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:160.0 - frontier/cluster_25/score:1.2546988019806997 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:176.0 - frontier/cluster_26/score:2.746416174909459 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:208.0 - frontier/cluster_30/score:1.6129790823227161 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:208.0 - frontier/cluster_32/score:1.78585054457917 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:112.0 - frontier/cluster_34/score:2.6969530989592996 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:112.0 - frontier/cluster_39/score:3.287217769271509 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:192.0 - frontier/cluster_42/score:1.363249171798468 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:144.0 - frontier/cluster_43/score:2.828093843849005 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:144.0 - frontier/cluster_45/score:2.3793191395949624 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:2.4919691939980195 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:144.0 - frontier/cluster_51/score:2.867504081577669 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:176.0 - frontier/cluster_53/score:2.5207657046310574 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:128.0 - frontier/cluster_56/score:2.9127990780320565 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:176.0 - frontier/cluster_57/score:2.909387350712603 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:356.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.03803530668882582 - cluster/prob_snapshot/cluster_6:0.028898479048931024 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.05441267581077218 - cluster/prob_snapshot/cluster_12:0.06070120447672781 - cluster/prob_snapshot/cluster_13:0.0 - 
cluster/prob_snapshot/cluster_14:0.05580912898085401 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.03571522865493023 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.04466181888773813 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.025415397116170645 - cluster/prob_snapshot/cluster_26:0.05563188362131869 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.032672784777185854 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.03617449917777748 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.05462995095359311 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.06658645475828959 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.027614212283341103 - cluster/prob_snapshot/cluster_43:0.05728636068652721 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.04819590223840958 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.05047776132944581 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0580846613151207 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.051061068456338204 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.05900216463978038 - cluster/prob_snapshot/cluster_57:0.05893305609792227 - 
cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  45%|████▍     | 357/800 [12:29:34<21:52:24, 177.75s/it]
[36m(TaskRunner pid=2823680)[0m step:357 - global_seqlen/min:598422 - global_seqlen/max:682327 - global_seqlen/minmax_diff:83905 - global_seqlen/balanced_min:636196 - global_seqlen/balanced_max:636228 - global_seqlen/mean:636211.5 - frontier/skipped_zero_acc_count:32.0 - actor/entropy:np.float64(0.15538516962745538) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.00042708555702120066 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.042397256271215156) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.001979006890602856) - actor/ppo_kl:np.float64(-0.0010793838359456724) - actor/pg_clipfrac_lower:np.float64(0.00021149862000887273) - actor/grad_norm:np.float64(0.8860711542268594) - perf/mfu/actor:np.float64(0.283745558038918) - perf/max_memory_allocated_gb:np.float64(122.3991756439209) - perf/max_memory_reserved_gb:np.float64(128.71875) - perf/cpu_memory_used_gb:np.float64(104.90157699584961) - actor/lr:np.float64(1e-06) - training/global_step:357 - training/epoch:0 - critic/score/mean:0.5520833134651184 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6158725619316101 - critic/rewards/max:1.6185470819473267 - critic/rewards/min:-0.10170149803161621 - critic/advantages/mean:-0.03386874124407768 - critic/advantages/max:2.4673571586608887 - critic/advantages/min:-2.474778413772583 - critic/returns/mean:-0.03386874124407768 - critic/returns/max:2.4673571586608887 - critic/returns/min:-2.474778413772583 - response_length/mean:2018.5078125 - response_length/max:8192.0 - response_length/min:250.0 - response_length/clip_ratio:0.1028645858168602 - response_length_non_aborted/mean:2018.5078125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:250.0 - response_length_non_aborted/clip_ratio:0.1028645858168602 - response/aborted_ratio:0.0 - prompt_length/mean:241.5833282470703 - prompt_length/max:728.0 - prompt_length/min:179.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.348282426595688e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7872890690341592) - timing_s/agent_loop/generate_sequences/max:np.float64(45.51828058157116) - timing_s/agent_loop/generate_sequences/mean:np.float64(12.276311018650631) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(45.51828058157116) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:192 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:48.11026466451585 - timing_s/reward:0.00013517122715711594 - timing_s/old_log_prob:12.281873723492026 - timing_s/ref:34.25628848653287 - timing_s/adv:0.07345640193670988 - timing_s/update_actor:27.58997064642608 - timing_s/update_weights:37.64588801283389 - timing_s/step:160.37377746403217 - timing_s/stop_profile:7.082056254148483e-05 - timing_per_token_ms/adv:4.231969001106719e-05 - timing_per_token_ms/update_actor:0.015895129279231503 - timing_per_token_ms/gen:0.031034595652287913 - timing_per_token_ms/ref:0.019735727199500428 - perf/total_num_tokens:2544846 - perf/time_per_step:160.37377746403217 - perf/throughput:3967.0544029100165 - frontier/active_count:20.0 - frontier/completed_count:44.0 - frontier/blacklisted_count:1973.0 - frontier/mean_score:2.37227472523779 - frontier/mean_frontier_pct:0.7206237879288222 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:192.0 - frontier/cluster_5/score:1.6143999860443683 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:160.0 - frontier/cluster_6/score:1.4266504228135097 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:2.1803587127989994 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:128.0 - frontier/cluster_12/score:2.9976736948601275 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:160.0 - frontier/cluster_14/score:2.828616455066828 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:176.0 - frontier/cluster_17/score:1.7631774314199484 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:128.0 - frontier/cluster_21/score:2.2048496978656367 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:160.0 - frontier/cluster_25/score:1.2546988019806997 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:192.0 - frontier/cluster_26/score:2.222491322436621 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:208.0 - frontier/cluster_30/score:2.029085357625901 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:224.0 - frontier/cluster_32/score:1.550095381205419 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:112.0 - frontier/cluster_34/score:2.7878671692715096 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:112.0 - frontier/cluster_39/score:3.201052438490056 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:160.0 - frontier/cluster_43/score:2.8796656906943032 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:144.0 - frontier/cluster_45/score:2.5655233977164738 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:144.0 - frontier/cluster_48/score:2.4919691939980195 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:160.0 - frontier/cluster_51/score:2.907252857104368 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - 
frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:176.0 - frontier/cluster_53/score:2.6645359932417403 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:128.0 - frontier/cluster_56/score:2.9389593546224395 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:192.0 - frontier/cluster_57/score:2.936571145498822 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:357.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.034026412895381365 - cluster/prob_snapshot/cluster_6:0.030069249729718944 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.04595502134729457 - cluster/prob_snapshot/cluster_12:0.06318141956681787 - cluster/prob_snapshot/cluster_13:0.0 - 
cluster/prob_snapshot/cluster_14:0.059618231079524235 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.037162167869136925 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.046471213355034784 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.026445056903241518 - cluster/prob_snapshot/cluster_26:0.04684304264578474 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.042766660539759194 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.03267107651391519 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.05875936584435983 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0674679961059146 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0606941864712919 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0540730668843448 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.05252277840097613 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0612756361262702 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.056159937230175885 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.06194390816872711 - cluster/prob_snapshot/cluster_57:0.061893572322330184 - 
cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  45%|████▍     | 358/800 [12:32:27<21:40:14, 176.50s/it]
[36m(TaskRunner pid=2823680)[0m step:358 - global_seqlen/min:628214 - global_seqlen/max:749781 - global_seqlen/minmax_diff:121567 - global_seqlen/balanced_min:701168 - global_seqlen/balanced_max:701259 - global_seqlen/mean:701217.25 - frontier/skipped_zero_acc_count:37.0 - actor/entropy:np.float64(0.15337773373998378) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:-0.0019813082180917263 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.12368212023284286) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.002274246837022593) - actor/ppo_kl:np.float64(0.002291734736939796) - actor/pg_clipfrac_lower:np.float64(0.00022487437706292005) - actor/grad_norm:np.float64(0.5890385148425897) - perf/mfu/actor:np.float64(0.2923296576140454) - perf/max_memory_allocated_gb:np.float64(127.70515775680542) - perf/max_memory_reserved_gb:np.float64(134.326171875) - perf/cpu_memory_used_gb:np.float64(104.90335845947266) - actor/lr:np.float64(1e-06) - training/global_step:358 - training/epoch:0 - critic/score/mean:0.5645604133605957 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6504928469657898 - critic/rewards/max:1.741054654121399 - critic/rewards/min:-0.10743681341409683 - critic/advantages/mean:-0.1056695282459259 - critic/advantages/max:2.4739010334014893 - critic/advantages/min:-2.474306106567383 - critic/returns/mean:-0.1056695282459259 - critic/returns/max:2.4739010334014893 - critic/returns/min:-2.474306106567383 - response_length/mean:2259.585205078125 - response_length/max:8192.0 - response_length/min:177.0 - response_length/clip_ratio:0.1318681389093399 - response_length_non_aborted/mean:2259.585205078125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:177.0 - response_length_non_aborted/clip_ratio:0.1318681389093399 - response/aborted_ratio:0.0 - prompt_length/mean:239.63735961914062 - prompt_length/max:418.0 - prompt_length/min:181.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.425768464803696e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3857422564178705) - timing_s/agent_loop/generate_sequences/max:np.float64(46.95578816905618) - timing_s/agent_loop/generate_sequences/mean:np.float64(14.155856309341289) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(46.95578816905618) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:191 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:48.97237897012383 - timing_s/reward:0.00012899283319711685 - timing_s/old_log_prob:12.175218901596963 - timing_s/ref:38.26053964253515 - timing_s/adv:0.07153126411139965 - timing_s/update_actor:29.810885568149388 - timing_s/update_weights:41.11632108874619 - timing_s/step:170.8032503342256 - timing_s/stop_profile:5.742069333791733e-05 - timing_per_token_ms/adv:3.931511893885662e-05 - timing_per_token_ms/update_actor:0.016384702917582823 - timing_per_token_ms/gen:0.029770841294001395 - timing_per_token_ms/ref:0.02102881425901415 - perf/total_num_tokens:2804869 - perf/time_per_step:170.8032503342256 - perf/throughput:4105.409285993487 - frontier/active_count:18.0 - frontier/completed_count:46.0 - frontier/blacklisted_count:2007.0 - frontier/mean_score:2.328059841452039 - frontier/mean_frontier_pct:0.724678282953628 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:5.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:208.0 - frontier/cluster_5/score:1.4300799902310577 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:176.0 - frontier/cluster_6/score:1.2986552959694568 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:112.0 - frontier/cluster_11/score:2.426251098959299 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:128.0 - frontier/cluster_12/score:2.9976736948601275 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:176.0 - frontier/cluster_14/score:2.2800315185467794 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:176.0 - 
frontier/cluster_17/score:1.7631774314199484 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:144.0 - frontier/cluster_21/score:2.4433947885059455 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:160.0 - frontier/cluster_25/score:1.7782891613864895 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:192.0 - frontier/cluster_26/score:2.455743925705635 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:208.0 - frontier/cluster_30/score:2.029085357625901 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:224.0 - frontier/cluster_32/score:1.550095381205419 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:128.0 - frontier/cluster_39/score:3.140736706943039 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:160.0 - frontier/cluster_43/score:2.915765983486012 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:160.0 - frontier/cluster_45/score:2.695866378401531 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:160.0 - frontier/cluster_48/score:2.0443784357986137 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:160.0 - frontier/cluster_51/score:2.9350769999730573 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:192.0 - frontier/cluster_53/score:2.765175195269218 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:192.0 - frontier/cluster_57/score:2.9555998018491754 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:358.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.03412665212961908 - cluster/prob_snapshot/cluster_6:0.03099040460993835 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.05789873839148819 - cluster/prob_snapshot/cluster_12:0.07153485684816328 - cluster/prob_snapshot/cluster_13:0.0 - 
cluster/prob_snapshot/cluster_14:0.05440943374464064 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0420755085421075 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.05830784608712279 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.042436126657994545 - cluster/prob_snapshot/cluster_26:0.05860253918974193 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.04842099086346548 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.036990634232690466 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0749488348629037 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.06958025571263796 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.06433269097680369 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.04878593657444414 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0700410833211805 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.06598663893699917 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.07053082831805876 - cluster/prob_snapshot/cluster_58:0.0 - 
cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  45%|████▍     | 359/800 [12:34:46<20:14:20, 165.22s/it]
[36m(TaskRunner pid=2823680)[0m step:359 - global_seqlen/min:539558 - global_seqlen/max:716443 - global_seqlen/minmax_diff:176885 - global_seqlen/balanced_min:590601 - global_seqlen/balanced_max:590651 - global_seqlen/mean:590633.0 - frontier/skipped_zero_acc_count:44.0 - actor/entropy:np.float64(0.14905685401477275) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0053795925341546535 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.04593928659596713) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0010363008472527976) - actor/ppo_kl:np.float64(0.00023910239279768488) - actor/pg_clipfrac_lower:np.float64(2.5699245573681157e-05) - actor/grad_norm:np.float64(0.9590902789072557) - perf/mfu/actor:np.float64(0.3283622481347057) - perf/max_memory_allocated_gb:np.float64(127.70515775680542) - perf/max_memory_reserved_gb:np.float64(134.326171875) - perf/cpu_memory_used_gb:np.float64(105.67305755615234) - actor/lr:np.float64(1e-06) - training/global_step:359 - training/epoch:0 - critic/score/mean:0.5684523582458496 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6190506815910339 - critic/rewards/max:1.4390990734100342 - critic/rewards/min:-0.10620615631341934 - critic/advantages/mean:0.01948407292366028 - critic/advantages/max:2.4713854789733887 - critic/advantages/min:-2.4746437072753906 - critic/returns/mean:0.01948407292366028 - critic/returns/max:2.4713854789733887 - critic/returns/min:-2.4746437072753906 - response_length/mean:1791.6502685546875 - response_length/max:8192.0 - response_length/min:206.0 - response_length/clip_ratio:0.0833333358168602 - response_length_non_aborted/mean:1791.6502685546875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:206.0 - response_length_non_aborted/clip_ratio:0.0833333358168602 - response/aborted_ratio:0.0 - prompt_length/mean:242.0833282470703 - prompt_length/max:398.0 - prompt_length/min:179.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.286628872156143e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6308277370408177) - timing_s/agent_loop/generate_sequences/max:np.float64(40.95897307712585) - timing_s/agent_loop/generate_sequences/mean:np.float64(11.045260780466378) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(40.95897307712585) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:193 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:43.16880865953863 - timing_s/reward:0.00014488399028778076 - timing_s/old_log_prob:10.92133068293333 - timing_s/ref:27.442607549019158 - timing_s/adv:0.07911441847681999 - timing_s/update_actor:22.16936441604048 - timing_s/update_weights:32.827489339746535 - timing_s/step:137.00575411319733 - timing_s/stop_profile:5.254894495010376e-05 - timing_per_token_ms/adv:5.7888500051453565e-05 - timing_per_token_ms/update_actor:0.016221458462905413 - timing_per_token_ms/gen:0.035854819819399204 - timing_per_token_ms/ref:0.020079922460390306 - perf/total_num_tokens:2362532 - perf/time_per_step:137.00575411319733 - perf/throughput:4311.008715093859 - frontier/active_count:18.0 - frontier/completed_count:46.0 - frontier/blacklisted_count:2051.0 - frontier/mean_score:2.251781631392959 - frontier/mean_frontier_pct:0.7834315929254552 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:8.0 - frontier/force_completed_count:6.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:224.0 - frontier/cluster_5/score:1.3010559931617405 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:192.0 - frontier/cluster_6/score:1.2090587071786196 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.5983757692715095 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:2.9983715864020892 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:192.0 - frontier/cluster_14/score:1.8960220629827456 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:192.0 - 
frontier/cluster_17/score:1.5342242019939638 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:160.0 - frontier/cluster_21/score:2.0103763519541618 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:160.0 - frontier/cluster_25/score:1.7782891613864895 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:208.0 - frontier/cluster_26/score:2.619020747993944 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:224.0 - frontier/cluster_30/score:1.7203597503381307 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:224.0 - frontier/cluster_32/score:1.550095381205419 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - 
frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.498515694860127 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:176.0 - frontier/cluster_43/score:2.9410361884402083 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:160.0 - frontier/cluster_45/score:2.7871064648810715 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:176.0 - frontier/cluster_48/score:1.7310649050590294 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:176.0 - frontier/cluster_51/score:2.95455389998114 - frontier/cluster_51/cluster_size:216.0 - 
frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:208.0 - frontier/cluster_53/score:3.4356226366884526 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:208.0 - frontier/cluster_57/score:2.9689198612944225 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:359.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.03209942185391769 - cluster/prob_snapshot/cluster_6:0.029829681191172366 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.06410666442583723 - cluster/prob_snapshot/cluster_12:0.07397529002024764 - cluster/prob_snapshot/cluster_13:0.0 - 
cluster/prob_snapshot/cluster_14:0.046778318814794086 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.03785210639444959 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.04959964747535234 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.04387363362500441 - cluster/prob_snapshot/cluster_26:0.06461601366573133 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.042444409508007395 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.03824367730262363 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.061642934446695524 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.07256072128837611 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.06876299455074797 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.042708525179587765 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.07289422786113896 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0847630700950332 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.07324866230128282 - cluster/prob_snapshot/cluster_58:0.0 - 
cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  45%|████▌     | 360/800 [12:37:16<19:37:39, 160.59s/it]
[36m(TaskRunner pid=2823680)[0m step:360 - global_seqlen/min:501147 - global_seqlen/max:738320 - global_seqlen/minmax_diff:237173 - global_seqlen/balanced_min:627095 - global_seqlen/balanced_max:627325 - global_seqlen/mean:627200.75 - frontier/skipped_zero_acc_count:43.0 - actor/entropy:np.float64(0.17333824425762476) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.001376901171170175 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.03220729711392778) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0012146219002694802) - actor/ppo_kl:np.float64(-0.00020182482695731304) - actor/pg_clipfrac_lower:np.float64(1.8550890037955087e-05) - actor/grad_norm:np.float64(0.8036435327746652) - perf/mfu/actor:np.float64(0.32083272489246467) - perf/max_memory_allocated_gb:np.float64(127.70515775680542) - perf/max_memory_reserved_gb:np.float64(134.326171875) - perf/cpu_memory_used_gb:np.float64(104.80972671508789) - actor/lr:np.float64(1e-06) - training/global_step:360 - training/epoch:0 - critic/score/mean:0.5514705777168274 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6120951771736145 - critic/rewards/max:1.9954965114593506 - critic/rewards/min:-0.27799636125564575 - critic/advantages/mean:-0.02787894569337368 - critic/advantages/max:2.4740967750549316 - critic/advantages/min:-2.4741146564483643 - critic/returns/mean:-0.02787894569337368 - critic/returns/max:2.4740967750549316 - critic/returns/min:-2.4741146564483643 - response_length/mean:1864.548583984375 - response_length/max:8192.0 - response_length/min:177.0 - response_length/clip_ratio:0.08970588445663452 - response_length_non_aborted/mean:1864.548583984375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:177.0 - response_length_non_aborted/clip_ratio:0.08970588445663452 - response/aborted_ratio:0.0 - prompt_length/mean:249.6117706298828 - prompt_length/max:817.0 - prompt_length/min:185.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.0001186467707157135 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.379558932967484) - timing_s/agent_loop/generate_sequences/max:np.float64(45.7631041072309) - timing_s/agent_loop/generate_sequences/mean:np.float64(12.005660980086759) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(45.7631041072309) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:190 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:47.78561102692038 - timing_s/reward:0.0002133157104253769 - timing_s/old_log_prob:11.840436233207583 - timing_s/ref:29.401220614090562 - timing_s/adv:0.0759937223047018 - timing_s/update_actor:24.036002955399454 - timing_s/update_weights:34.47095494251698 - timing_s/step:148.01518084760755 - timing_s/stop_profile:4.956871271133423e-05 - timing_per_token_ms/adv:5.2860454473791085e-05 - timing_per_token_ms/update_actor:0.01671919734187294 - timing_per_token_ms/gen:0.03768899349307897 - timing_per_token_ms/ref:0.02045118776408278 - perf/total_num_tokens:2508803 - perf/time_per_step:148.01518084760755 - perf/throughput:4237.408260479369 - frontier/active_count:14.0 - frontier/completed_count:50.0 - frontier/blacklisted_count:2090.0 - frontier/mean_score:2.455070430417382 - frontier/mean_frontier_pct:0.7728880678210043 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:7.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:8.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:128.0 - frontier/cluster_11/score:2.7188630384900563 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:144.0 - frontier/cluster_12/score:2.9988601104814623 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:160.0 - frontier/cluster_21/score:2.307263446367913 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:176.0 - frontier/cluster_25/score:1.5448024129705427 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:208.0 - frontier/cluster_26/score:2.7333145235957605 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:224.0 - frontier/cluster_30/score:1.7203597503381307 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:240.0 - frontier/cluster_32/score:1.3850667668437933 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:144.0 - frontier/cluster_39/score:2.6489609864020887 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:176.0 - frontier/cluster_43/score:2.9587253319081457 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:176.0 - frontier/cluster_45/score:2.25097452541675 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:176.0 - frontier/cluster_48/score:1.7310649050590294 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:176.0 - frontier/cluster_51/score:2.968187729986798 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - 
frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:208.0 - frontier/cluster_53/score:3.4356226366884526 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:208.0 - frontier/cluster_57/score:2.9689198612944225 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:360.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.07910343440382417 - cluster/prob_snapshot/cluster_12:0.0872497550179863 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - 
cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.06712822974101165 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.04494495478858287 - cluster/prob_snapshot/cluster_26:0.07952389033996864 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.05005267375933301 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.040297556951155535 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.07706968267975642 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.08608206147136709 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.06549054262581397 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.05036413281124527 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0863573633807024 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.0999570577959337 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.08637866423331902 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - 
cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  45%|████▌     | 361/800 [12:39:58<19:37:48, 160.98s/it]
[36m(TaskRunner pid=2823680)[0m step:361 - global_seqlen/min:545225 - global_seqlen/max:765929 - global_seqlen/minmax_diff:220704 - global_seqlen/balanced_min:645163 - global_seqlen/balanced_max:645251 - global_seqlen/mean:645208.25 - frontier/skipped_zero_acc_count:34.0 - actor/entropy:np.float64(0.1527084691667969) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0033128715585917234 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.027121616789372638) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0012692465684364272) - actor/ppo_kl:np.float64(0.00010129040360928403) - actor/pg_clipfrac_lower:np.float64(1.8184678083343897e-05) - actor/grad_norm:np.float64(0.6218508332967758) - perf/mfu/actor:np.float64(0.29894878129550306) - perf/max_memory_allocated_gb:np.float64(127.70515775680542) - perf/max_memory_reserved_gb:np.float64(134.326171875) - perf/cpu_memory_used_gb:np.float64(104.95020961761475) - actor/lr:np.float64(1e-06) - training/global_step:361 - training/epoch:0 - critic/score/mean:0.6289893388748169 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6914225816726685 - critic/rewards/max:2.4261136054992676 - critic/rewards/min:-0.09346099942922592 - critic/advantages/mean:-0.1622423529624939 - critic/advantages/max:2.4737606048583984 - critic/advantages/min:-2.4741790294647217 - critic/returns/mean:-0.1622423529624939 - critic/returns/max:2.4737606048583984 - critic/returns/min:-2.4741790294647217 - response_length/mean:2063.53466796875 - response_length/max:8192.0 - response_length/min:198.0 - response_length/clip_ratio:0.11702127754688263 - response_length_non_aborted/mean:2063.53466796875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:198.0 - response_length_non_aborted/clip_ratio:0.11702127754688263 - response/aborted_ratio:0.0 - prompt_length/mean:245.20213317871094 - prompt_length/max:684.0 - prompt_length/min:185.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:9.35262069106102e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.035230745561421) - timing_s/agent_loop/generate_sequences/max:np.float64(45.35646855831146) - timing_s/agent_loop/generate_sequences/mean:np.float64(12.716040226869154) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(45.35646855831146) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:183 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:47.43646267335862 - timing_s/reward:0.00014294497668743134 - timing_s/old_log_prob:12.300729544833302 - timing_s/ref:34.81710353773087 - timing_s/adv:0.090734688565135 - timing_s/update_actor:26.77462015952915 - timing_s/update_weights:38.14960778225213 - timing_s/step:160.11253299936652 - timing_s/stop_profile:5.779881030321121e-05 - timing_per_token_ms/adv:5.226140790656157e-05 - timing_per_token_ms/update_actor:0.015421658109245725 - timing_per_token_ms/gen:0.030569103746385513 - timing_per_token_ms/ref:0.020053971407022857 - perf/total_num_tokens:2580833 - perf/time_per_step:160.11253299936652 - perf/throughput:4029.7173363846086 - frontier/active_count:13.0 - frontier/completed_count:51.0 - frontier/blacklisted_count:2123.0 - frontier/mean_score:2.5271656873125443 - frontier/mean_frontier_pct:0.8005935012637492 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:9.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:8.0 - frontier/replay_slots_count:16.0 - 
frontier/replay_pool_size:4873.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:144.0 - frontier/cluster_11/score:2.803204126943039 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:160.0 - frontier/cluster_12/score:2.9992020773370234 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - 
frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:176.0 - frontier/cluster_21/score:2.515084412457539 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:176.0 - frontier/cluster_25/score:1.9813616890793797 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:224.0 - frontier/cluster_30/score:1.7203597503381307 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:256.0 - frontier/cluster_32/score:1.2695467367906552 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - 
frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.7542726904814616 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:192.0 - frontier/cluster_43/score:3.571107732335702 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:176.0 - frontier/cluster_45/score:2.4756821677917245 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:192.0 - frontier/cluster_48/score:1.5117454335413205 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:192.0 - frontier/cluster_51/score:2.977731410990758 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - 
frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:208.0 - frontier/cluster_53/score:3.3049358456819165 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:208.0 - frontier/cluster_57/score:2.9689198612944225 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:361.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.08532526686733942 - cluster/prob_snapshot/cluster_12:0.09129114614886563 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - 
cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.07655534130539818 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.060309633985087145 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.05236513224083634 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.03864307029090776 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.08383586841998503 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.1086990837894677 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.07535599695192464 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.04601522996937853 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.09063761174578508 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.10059721669993664 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.0903694015850878 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - 
cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  45%|████▌     | 362/800 [12:42:25<19:05:23, 156.90s/it]
[36m(TaskRunner pid=2823680)[0m step:362 - global_seqlen/min:598565 - global_seqlen/max:749527 - global_seqlen/minmax_diff:150962 - global_seqlen/balanced_min:673062 - global_seqlen/balanced_max:673120 - global_seqlen/mean:673087.0 - frontier/skipped_zero_acc_count:24.0 - actor/entropy:np.float64(0.15719484904995903) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.0010757158743217587 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.061411284947098466) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00119751543080873) - actor/ppo_kl:np.float64(0.0001673987260237307) - actor/pg_clipfrac_lower:np.float64(2.1235842810850816e-05) - actor/grad_norm:np.float64(0.9479782466705029) - perf/mfu/actor:np.float64(0.2789337242295231) - perf/max_memory_allocated_gb:np.float64(127.70515775680542) - perf/max_memory_reserved_gb:np.float64(134.326171875) - perf/cpu_memory_used_gb:np.float64(105.01029586791992) - actor/lr:np.float64(1e-06) - training/global_step:362 - training/epoch:0 - critic/score/mean:0.5745192170143127 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6503458619117737 - critic/rewards/max:1.386474609375 - critic/rewards/min:-0.2864123284816742 - critic/advantages/mean:-0.1273346096277237 - critic/advantages/max:2.473999261856079 - critic/advantages/min:-2.474684000015259 - critic/returns/mean:-0.1273346096277237 - critic/returns/max:2.473999261856079 - critic/returns/min:-2.474684000015259 - response_length/mean:2168.0830078125 - response_length/max:8192.0 - response_length/min:102.0 - response_length/clip_ratio:0.12860576808452606 - response_length_non_aborted/mean:2168.0830078125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:102.0 - response_length_non_aborted/clip_ratio:0.12860576808452606 - response/aborted_ratio:0.0 - prompt_length/mean:236.02883911132812 - prompt_length/max:447.0 - prompt_length/min:185.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.41999426484108e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.8655981421470642) - timing_s/agent_loop/generate_sequences/max:np.float64(46.06143701355904) - timing_s/agent_loop/generate_sequences/mean:np.float64(13.468046083757145) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(46.06143701355904) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:175 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:47.93489502929151 - timing_s/reward:0.00015191640704870224 - timing_s/old_log_prob:13.143613319844007 - timing_s/ref:24.55420507211238 - timing_s/adv:0.09095886815339327 - timing_s/update_actor:29.995998385362327 - timing_s/update_weights:29.26540463976562 - timing_s/step:145.41489955037832 - timing_s/stop_profile:5.133636295795441e-05 - timing_per_token_ms/adv:4.5474409154485065e-05 - timing_per_token_ms/update_actor:0.014996342096879459 - timing_per_token_ms/gen:0.026573732792613283 - timing_per_token_ms/ref:0.012275746066115884 - perf/total_num_tokens:2692348 - perf/time_per_step:145.41489955037832 - perf/throughput:4628.734758825812 - frontier/active_count:12.0 - frontier/completed_count:52.0 - frontier/blacklisted_count:2145.0 - frontier/mean_score:2.7096090768567045 - frontier/mean_frontier_pct:0.8159619369865201 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:9.0 - frontier/replay_slots_count:24.0 - frontier/replay_pool_size:5027.0 
- frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:160.0 - frontier/cluster_11/score:2.2622428888601274 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:160.0 - frontier/cluster_12/score:2.999441454135916 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:176.0 - frontier/cluster_21/score:2.6605590887202775 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:192.0 - frontier/cluster_25/score:2.2869531823555658 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:224.0 - frontier/cluster_30/score:1.7203597503381307 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:160.0 - frontier/cluster_39/score:2.827990883337023 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:192.0 - frontier/cluster_43/score:3.399775412634991 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:192.0 - frontier/cluster_45/score:2.632977517454207 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:192.0 - frontier/cluster_48/score:1.9582218034789243 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:208.0 - frontier/cluster_51/score:3.5844119876935308 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:224.0 - frontier/cluster_53/score:3.2134550919773415 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:208.0 - frontier/cluster_57/score:2.9689198612944225 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:362.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.06957470077456257 - cluster/prob_snapshot/cluster_12:0.09224705388176736 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - 
cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.08182481350798988 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.07033465952367064 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.052909223604493856 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.0869741354786637 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.10455922226546537 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.08097654934626848 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.060224610141627555 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.11023767285315211 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.09882898851301969 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.09130837010931889 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(RewardLoopWorker pid=2826762) WARNING:2026-04-13 00:15:24,728:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  45%|████▌     | 363/800 [12:45:03<19:05:14, 157.24s/it]
[36m(TaskRunner pid=2823680)[0m step:363 - global_seqlen/min:571079 - global_seqlen/max:668052 - global_seqlen/minmax_diff:96973 - global_seqlen/balanced_min:616935 - global_seqlen/balanced_max:617052 - global_seqlen/mean:616993.0 - frontier/skipped_zero_acc_count:33.0 - actor/entropy:np.float64(0.15763656183844432) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.004231135826557875 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.021490251925570192) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0011733820163802495) - actor/ppo_kl:np.float64(0.00012823522064309145) - actor/pg_clipfrac_lower:np.float64(2.0429462855039066e-05) - actor/grad_norm:np.float64(0.7801044136285782) - perf/mfu/actor:np.float64(0.282116319817106) - perf/max_memory_allocated_gb:np.float64(127.70515775680542) - perf/max_memory_reserved_gb:np.float64(134.326171875) - perf/cpu_memory_used_gb:np.float64(104.95743560791016) - actor/lr:np.float64(1e-06) - training/global_step:363 - training/epoch:0 - critic/score/mean:0.5828947424888611 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6343224048614502 - critic/rewards/max:1.5144277811050415 - critic/rewards/min:-0.09610498696565628 - critic/advantages/mean:-0.15820889174938202 - critic/advantages/max:2.474224328994751 - critic/advantages/min:-2.4748036861419678 - critic/returns/mean:-0.15820889174938202 - critic/returns/max:2.474224328994751 - critic/returns/min:-2.4748036861419678 - response_length/mean:1929.37890625 - response_length/max:8192.0 - response_length/min:159.0 - response_length/clip_ratio:0.09736841917037964 - response_length_non_aborted/mean:1929.37890625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:159.0 - response_length_non_aborted/clip_ratio:0.09736841917037964 - response/aborted_ratio:0.0 - prompt_length/mean:249.6210479736328 - prompt_length/max:817.0 - prompt_length/min:184.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.371938019990921e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.3762024072930217) - timing_s/agent_loop/generate_sequences/max:np.float64(45.84355209302157) - timing_s/agent_loop/generate_sequences/mean:np.float64(11.917370539322292) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(45.84355209302157) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:185 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:48.20287906844169 - timing_s/reward:0.0002717087045311928 - timing_s/old_log_prob:12.441544201225042 - timing_s/ref:31.6945414301008 - timing_s/adv:0.08458139840513468 - timing_s/update_actor:26.940303252078593 - timing_s/update_weights:36.358119586482644 - timing_s/step:156.1128256842494 - timing_s/stop_profile:6.214529275894165e-05 - timing_per_token_ms/adv:5.107448999126511e-05 - timing_per_token_ms/update_actor:0.016267906120672564 - timing_per_token_ms/gen:0.03287319008328402 - timing_per_token_ms/ref:0.01913875355069974 - perf/total_num_tokens:2467972 - perf/time_per_step:156.1128256842494 - perf/throughput:3952.224919994193 - frontier/active_count:11.0 - frontier/completed_count:53.0 - frontier/blacklisted_count:2178.0 - frontier/mean_score:2.6004644333619895 - frontier/mean_frontier_pct:0.8448034519242708 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:6.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:9.0 - frontier/replay_slots_count:32.0 - frontier/replay_pool_size:5026.0 - 
frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:176.0 - frontier/cluster_11/score:1.883570022202089 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:2.999609017895141 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:192.0 - frontier/cluster_25/score:2.5008672276488957 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:224.0 - frontier/cluster_30/score:1.7203597503381307 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:176.0 - frontier/cluster_39/score:2.8795936183359157 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:208.0 - frontier/cluster_43/score:3.2798427888444937 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:208.0 - frontier/cluster_45/score:2.1430842622179447 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:208.0 - frontier/cluster_48/score:2.2707552624352467 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:208.0 - frontier/cluster_51/score:3.4090883913854713 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - 
frontier/cluster_53/frontier:240.0 - frontier/cluster_53/score:2.5494185643841387 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:208.0 - frontier/cluster_57/score:2.9689198612944225 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:363.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.06584732949438192 - cluster/prob_snapshot/cluster_12:0.10486270275460409 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - 
cluster/prob_snapshot/cluster_18:0.0 - cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.08742729307624868 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.060141695819171156 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.10066710956400046 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.11465933639903962 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.07491963340099758 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.07938285712992356 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.11917760632046577 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.08912458907783859 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.10378984696332871 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  46%|████▌     | 364/800 [12:47:51<19:24:07, 160.20s/it]
[36m(TaskRunner pid=2823680)[0m step:364 - global_seqlen/min:506826 - global_seqlen/max:773716 - global_seqlen/minmax_diff:266890 - global_seqlen/balanced_min:649936 - global_seqlen/balanced_max:650012 - global_seqlen/mean:649963.5 - frontier/skipped_zero_acc_count:27.0 - actor/entropy:np.float64(0.16619879790746114) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.001536200288683176 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.06051101069897413) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0012930347615574469) - actor/ppo_kl:np.float64(-6.741378237451249e-05) - actor/pg_clipfrac_lower:np.float64(1.9524873862917278e-05) - actor/grad_norm:np.float64(0.5926497051349053) - perf/mfu/actor:np.float64(0.2816631710588833) - perf/max_memory_allocated_gb:np.float64(127.70515775680542) - perf/max_memory_reserved_gb:np.float64(134.326171875) - perf/cpu_memory_used_gb:np.float64(105.14440155029297) - actor/lr:np.float64(1e-06) - training/global_step:364 - training/epoch:0 - critic/score/mean:0.5977723002433777 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6718999147415161 - critic/rewards/max:1.7039611339569092 - critic/rewards/min:-0.11181900650262833 - critic/advantages/mean:-0.04315334931015968 - critic/advantages/max:2.474040985107422 - critic/advantages/min:-2.474672317504883 - critic/returns/mean:-0.04315334931015968 - critic/returns/max:2.474040985107422 - critic/returns/min:-2.474672317504883 - response_length/mean:2113.388671875 - response_length/max:8192.0 - response_length/min:169.0 - response_length/clip_ratio:0.12747524678707123 - response_length_non_aborted/mean:2113.388671875 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:169.0 - response_length_non_aborted/clip_ratio:0.12747524678707123 - response/aborted_ratio:0.0 - prompt_length/mean:254.3663330078125 - prompt_length/max:817.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) 
- num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.168630301952362e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.1816843086853623) - timing_s/agent_loop/generate_sequences/max:np.float64(45.95518186874688) - timing_s/agent_loop/generate_sequences/mean:np.float64(12.800684166876636) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(45.95518186874688) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:191 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:47.84126410074532 - timing_s/reward:0.0001664329320192337 - timing_s/old_log_prob:12.62713434919715 - timing_s/ref:36.47126497980207 - timing_s/adv:0.09377333335578442 - timing_s/update_actor:28.610493203625083 - timing_s/update_weights:38.84433155041188 - timing_s/step:164.90087749250233 - timing_s/stop_profile:6.026215851306915e-05 - timing_per_token_ms/adv:4.9015252027699096e-05 - timing_per_token_ms/update_actor:0.014954683648621216 - timing_per_token_ms/gen:0.02801637374444713 - timing_per_token_ms/ref:0.019063503245336254 - perf/total_num_tokens:2599854 - perf/time_per_step:164.90087749250233 - perf/throughput:3941.5405780939664 - frontier/active_count:9.0 - frontier/completed_count:55.0 - frontier/blacklisted_count:2204.0 - frontier/mean_score:2.4275675771351235 - frontier/mean_frontier_pct:0.8342612191518908 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:6.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:9.0 - frontier/replay_slots_count:40.0 - frontier/replay_pool_size:5159.0 
- frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:176.0 - frontier/cluster_11/score:2.218499015541462 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:176.0 - frontier/cluster_12/score:2.9997263125265983 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:208.0 - frontier/cluster_25/score:2.650607059354227 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:224.0 - frontier/cluster_30/score:1.7203597503381307 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:176.0 - frontier/cluster_39/score:2.915715532835141 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:209.0 - frontier/cluster_43/score:0.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:224.0 - frontier/cluster_45/score:1.8001589835525613 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:208.0 - frontier/cluster_48/score:2.4895286837046724 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:256.0 - 
frontier/cluster_53/score:2.084592995068897 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:208.0 - frontier/cluster_57/score:2.9689198612944225 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:364.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.10154192737514771 - cluster/prob_snapshot/cluster_12:0.13729913298949706 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0 - 
cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.12131975161382288 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.07874181760018768 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.13345391312219076 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.08239427265510892 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.11394710524015667 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.09541297473164084 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.13588910467224755 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
(TaskRunner pid=2823680) Training Progress:  46%|████▌     | 365/800 [12:50:22<19:01:35, 157.46s/it]
[36m(TaskRunner pid=2823680)[0m step:365 - global_seqlen/min:558906 - global_seqlen/max:764342 - global_seqlen/minmax_diff:205436 - global_seqlen/balanced_min:666601 - global_seqlen/balanced_max:666680 - global_seqlen/mean:666636.25 - frontier/skipped_zero_acc_count:25.0 - actor/entropy:np.float64(0.13564922637306154) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:-0.0006108313682489097 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.0022388823417713866) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0013052743079942258) - actor/ppo_kl:np.float64(0.0001610744592251369) - actor/pg_clipfrac_lower:np.float64(1.5847383268080124e-05) - actor/grad_norm:np.float64(0.7343077201109666) - perf/mfu/actor:np.float64(0.26744331204391586) - perf/max_memory_allocated_gb:np.float64(127.70515775680542) - perf/max_memory_reserved_gb:np.float64(134.326171875) - perf/cpu_memory_used_gb:np.float64(105.27114486694336) - actor/lr:np.float64(1e-06) - training/global_step:365 - training/epoch:0 - critic/score/mean:0.5764563083648682 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.657187283039093 - critic/rewards/max:1.7311546802520752 - critic/rewards/min:-0.09158967435359955 - critic/advantages/mean:-0.10744007676839828 - critic/advantages/max:2.4740564823150635 - critic/advantages/min:-2.474358558654785 - critic/returns/mean:-0.10744007676839828 - critic/returns/max:2.4740564823150635 - critic/returns/min:-2.474358558654785 - response_length/mean:2144.411376953125 - response_length/max:8192.0 - response_length/min:224.0 - response_length/clip_ratio:0.12742719054222107 - response_length_non_aborted/mean:2144.411376953125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:224.0 - response_length_non_aborted/clip_ratio:0.12742719054222107 - response/aborted_ratio:0.0 - prompt_length/mean:229.50485229492188 - prompt_length/max:352.0 - prompt_length/min:176.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.00010627973824739456 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.7705290131270885) - timing_s/agent_loop/generate_sequences/max:np.float64(47.13582342956215) - timing_s/agent_loop/generate_sequences/mean:np.float64(13.27564610923946) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(47.13582342956215) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:186 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:49.5780657408759 - timing_s/reward:0.00021128449589014053 - timing_s/old_log_prob:12.975070831365883 - timing_s/ref:25.081674153916538 - timing_s/adv:0.09525792207568884 - timing_s/update_actor:30.955653727054596 - timing_s/update_weights:29.658686607144773 - timing_s/step:148.81579696759582 - timing_s/stop_profile:6.158091127872467e-05 - timing_per_token_ms/adv:4.869770522557756e-05 - timing_per_token_ms/update_actor:0.015825133148163466 - timing_per_token_ms/gen:0.028057841556357488 - timing_per_token_ms/ref:0.01282224037535602 - perf/total_num_tokens:2666545 - perf/time_per_step:148.81579696759582 - perf/throughput:4479.60675938965 - frontier/active_count:7.0 - frontier/completed_count:57.0 - frontier/blacklisted_count:2227.0 - frontier/mean_score:2.4553427171536044 - frontier/mean_frontier_pct:0.8482863241472004 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:4.0 - frontier/batch_hard_count:3.0 - frontier/force_completed_count:9.0 - frontier/replay_slots_count:56.0 - 
frontier/replay_pool_size:5471.0 - frontier/cluster_0/frontier:120.0 - frontier/cluster_0/score:0.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:138.0 - frontier/cluster_1/score:0.0 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:197.0 - frontier/cluster_2/score:0.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:147.0 - frontier/cluster_3/score:0.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:62.0 - frontier/cluster_4/score:0.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:238.0 - frontier/cluster_5/score:0.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:319.0 - frontier/cluster_6/score:0.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:139.0 - frontier/cluster_7/score:0.0 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:241.0 - frontier/cluster_8/score:0.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:175.0 - frontier/cluster_9/score:0.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:220.0 - frontier/cluster_10/score:0.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:190.0 - frontier/cluster_11/score:0.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:192.0 - frontier/cluster_12/score:2.9998084187686187 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:82.0 - frontier/cluster_13/score:0.0 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:208.0 - frontier/cluster_14/score:0.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:161.0 - frontier/cluster_15/score:0.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:150.0 - frontier/cluster_16/score:0.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:193.0 - frontier/cluster_17/score:0.0 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:216.0 - frontier/cluster_18/score:0.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:127.0 - frontier/cluster_19/score:0.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:136.0 - frontier/cluster_20/score:0.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:183.0 - frontier/cluster_21/score:0.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:233.0 - frontier/cluster_22/score:0.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:227.0 - frontier/cluster_23/score:0.0 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:97.0 - frontier/cluster_24/score:0.0 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:224.0 - frontier/cluster_25/score:2.1554249415479587 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:213.0 - frontier/cluster_26/score:0.0 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:213.0 - frontier/cluster_27/score:0.0 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:126.0 - frontier/cluster_28/score:0.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:48.0 - frontier/cluster_29/score:0.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:224.0 - frontier/cluster_30/score:1.7203597503381307 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:121.0 - frontier/cluster_31/score:0.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:270.0 - frontier/cluster_32/score:0.0 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:171.0 - frontier/cluster_33/score:0.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:121.0 - frontier/cluster_34/score:0.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:184.0 - frontier/cluster_35/score:0.0 - 
frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:108.0 - frontier/cluster_36/score:0.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:193.0 - frontier/cluster_37/score:0.0 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:164.0 - frontier/cluster_38/score:0.0 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:192.0 - frontier/cluster_39/score:2.9410008729845987 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:102.0 - frontier/cluster_40/score:0.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:190.0 - frontier/cluster_41/score:0.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:392.0 - frontier/cluster_42/score:0.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:209.0 - frontier/cluster_43/score:0.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:197.0 - frontier/cluster_44/score:0.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:240.0 - frontier/cluster_45/score:0.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:191.0 - frontier/cluster_46/score:0.0 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:155.0 - frontier/cluster_47/score:0.0 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:224.0 - frontier/cluster_48/score:2.6426700785932704 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:219.0 - frontier/cluster_49/score:0.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:228.0 - frontier/cluster_50/score:0.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:216.0 - frontier/cluster_51/score:0.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:91.0 - frontier/cluster_52/score:0.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:272.0 - 
frontier/cluster_53/score:1.7592150965482278 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:122.0 - frontier/cluster_54/score:0.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:99.0 - frontier/cluster_55/score:0.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:133.0 - frontier/cluster_56/score:0.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:208.0 - frontier/cluster_57/score:2.9689198612944225 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:105.0 - frontier/cluster_58/score:0.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:174.0 - frontier/cluster_59/score:0.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:38.0 - frontier/cluster_60/score:0.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:54.0 - frontier/cluster_61/score:0.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:70.0 - frontier/cluster_62/score:0.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:261.0 - frontier/cluster_63/score:0.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:365.0 - cluster/prob_snapshot/cluster_0:0.0 - cluster/prob_snapshot/cluster_1:0.0 - cluster/prob_snapshot/cluster_2:0.0 - cluster/prob_snapshot/cluster_3:0.0 - cluster/prob_snapshot/cluster_4:0.0 - cluster/prob_snapshot/cluster_5:0.0 - cluster/prob_snapshot/cluster_6:0.0 - cluster/prob_snapshot/cluster_7:0.0 - cluster/prob_snapshot/cluster_8:0.0 - cluster/prob_snapshot/cluster_9:0.0 - cluster/prob_snapshot/cluster_10:0.0 - cluster/prob_snapshot/cluster_11:0.0 - cluster/prob_snapshot/cluster_12:0.1745353334303103 - cluster/prob_snapshot/cluster_13:0.0 - cluster/prob_snapshot/cluster_14:0.0 - cluster/prob_snapshot/cluster_15:0.0 - cluster/prob_snapshot/cluster_16:0.0 - cluster/prob_snapshot/cluster_17:0.0 - cluster/prob_snapshot/cluster_18:0.0 - 
cluster/prob_snapshot/cluster_19:0.0 - cluster/prob_snapshot/cluster_20:0.0 - cluster/prob_snapshot/cluster_21:0.0 - cluster/prob_snapshot/cluster_22:0.0 - cluster/prob_snapshot/cluster_23:0.0 - cluster/prob_snapshot/cluster_24:0.0 - cluster/prob_snapshot/cluster_25:0.12540727884599517 - cluster/prob_snapshot/cluster_26:0.0 - cluster/prob_snapshot/cluster_27:0.0 - cluster/prob_snapshot/cluster_28:0.0 - cluster/prob_snapshot/cluster_29:0.0 - cluster/prob_snapshot/cluster_30:0.10009424627476884 - cluster/prob_snapshot/cluster_31:0.0 - cluster/prob_snapshot/cluster_32:0.0 - cluster/prob_snapshot/cluster_33:0.0 - cluster/prob_snapshot/cluster_34:0.0 - cluster/prob_snapshot/cluster_35:0.0 - cluster/prob_snapshot/cluster_36:0.0 - cluster/prob_snapshot/cluster_37:0.0 - cluster/prob_snapshot/cluster_38:0.0 - cluster/prob_snapshot/cluster_39:0.17111378339150968 - cluster/prob_snapshot/cluster_40:0.0 - cluster/prob_snapshot/cluster_41:0.0 - cluster/prob_snapshot/cluster_42:0.0 - cluster/prob_snapshot/cluster_43:0.0 - cluster/prob_snapshot/cluster_44:0.0 - cluster/prob_snapshot/cluster_45:0.0 - cluster/prob_snapshot/cluster_46:0.0 - cluster/prob_snapshot/cluster_47:0.0 - cluster/prob_snapshot/cluster_48:0.15375625337531165 - cluster/prob_snapshot/cluster_49:0.0 - cluster/prob_snapshot/cluster_50:0.0 - cluster/prob_snapshot/cluster_51:0.0 - cluster/prob_snapshot/cluster_52:0.0 - cluster/prob_snapshot/cluster_53:0.10235493424534038 - cluster/prob_snapshot/cluster_54:0.0 - cluster/prob_snapshot/cluster_55:0.0 - cluster/prob_snapshot/cluster_56:0.0 - cluster/prob_snapshot/cluster_57:0.17273817043676382 - cluster/prob_snapshot/cluster_58:0.0 - cluster/prob_snapshot/cluster_59:0.0 - cluster/prob_snapshot/cluster_60:0.0 - cluster/prob_snapshot/cluster_61:0.0 - cluster/prob_snapshot/cluster_62:0.0 - cluster/prob_snapshot/cluster_63:0.0
[36m(TaskRunner pid=2823680)[0m Training Progress:  46%|████▌     | 366/800 [12:52:38<18:13:15, 151.14s/it]
[36m(TaskRunner pid=2823680)[0m step:366 - global_seqlen/min:480312 - global_seqlen/max:669493 - global_seqlen/minmax_diff:189181 - global_seqlen/balanced_min:561681 - global_seqlen/balanced_max:561711 - global_seqlen/mean:561699.5 - frontier/skipped_zero_acc_count:24.0 - actor/entropy:np.float64(0.1398494670418306) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.004750716965645552 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.029203712751041166) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.001554819677739243) - actor/ppo_kl:np.float64(0.0005222020223123117) - actor/pg_clipfrac_lower:np.float64(4.3467503613870715e-05) - actor/grad_norm:np.float64(0.921455605671956) - perf/mfu/actor:np.float64(0.24882551709244174) - perf/max_memory_allocated_gb:np.float64(127.70515775680542) - perf/max_memory_reserved_gb:np.float64(134.326171875) - perf/cpu_memory_used_gb:np.float64(105.20399856567383) - actor/lr:np.float64(1e-06) - training/global_step:366 - training/epoch:0 - critic/score/mean:0.5637019276618958 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6160438060760498 - critic/rewards/max:1.8860433101654053 - critic/rewards/min:-0.08912593871355057 - critic/advantages/mean:-0.06047852709889412 - critic/advantages/max:2.4739322662353516 - critic/advantages/min:-2.474323272705078 - critic/returns/mean:-0.06047852709889412 - critic/returns/max:2.4739322662353516 - critic/returns/min:-2.474323272705078 - response_length/mean:1826.79443359375 - response_length/max:8192.0 - response_length/min:259.0 - response_length/clip_ratio:0.09975961595773697 - response_length_non_aborted/mean:1826.79443359375 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:259.0 - response_length_non_aborted/clip_ratio:0.09975961595773697 - response/aborted_ratio:0.0 - prompt_length/mean:236.875 - prompt_length/max:479.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - num_turns/min:np.int32(2) - 
num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.605606853961945e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.8340892624109983) - timing_s/agent_loop/generate_sequences/max:np.float64(44.83186776470393) - timing_s/agent_loop/generate_sequences/mean:np.float64(10.505277828445287) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(44.83186776470393) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:215 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:47.39862515963614 - timing_s/reward:0.0001449519768357277 - timing_s/old_log_prob:11.861839367076755 - timing_s/ref:21.150313480757177 - timing_s/adv:0.1320121567696333 - timing_s/update_actor:27.803533567115664 - timing_s/update_weights:25.515376737341285 - timing_s/step:134.3115424970165 - timing_s/stop_profile:4.9532391130924225e-05 - timing_per_token_ms/adv:7.688656535055199e-05 - timing_per_token_ms/update_actor:0.01619334349877119 - timing_per_token_ms/gen:0.031185501321235206 - timing_per_token_ms/ref:0.012318372787898922 - perf/total_num_tokens:2246798 - perf/time_per_step:134.3115424970165 - perf/throughput:4182.064248219599 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:24.0 - frontier/mean_score:2.065625 - frontier/mean_frontier_pct:0.004808039747064138 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:13.0 - frontier/batch_hard_count:2.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - frontier/replay_pool_size:0.0 - 
frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:0.0 - frontier/cluster_1/score:2.3 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.0 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.0 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.3 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.0 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.0 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:0.0 - frontier/cluster_12/score:2.0 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.9 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.0 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.3 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.0 - 
frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:0.0 - frontier/cluster_23/score:2.3 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.3 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:0.0 - frontier/cluster_26/score:2.3 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.3 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:0.0 - frontier/cluster_29/score:2.0 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:1.7 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.0 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.0 - frontier/cluster_36/cluster_size:108.0 - 
frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.3 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.3 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:0.0 - frontier/cluster_45/score:2.0 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:0.0 - frontier/cluster_47/score:2.3 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:0.0 - frontier/cluster_50/score:2.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:0.0 - frontier/cluster_52/score:2.0 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.0 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - 
frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.0 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:366.0 - cluster/prob_snapshot/cluster_0:0.015128593040847203 - cluster/prob_snapshot/cluster_1:0.01739788199697428 - cluster/prob_snapshot/cluster_2:0.015128593040847203 - cluster/prob_snapshot/cluster_3:0.015128593040847203 - cluster/prob_snapshot/cluster_4:0.015128593040847203 - cluster/prob_snapshot/cluster_5:0.015128593040847203 - cluster/prob_snapshot/cluster_6:0.015128593040847203 - cluster/prob_snapshot/cluster_7:0.01739788199697428 - cluster/prob_snapshot/cluster_8:0.015128593040847203 - cluster/prob_snapshot/cluster_9:0.015128593040847203 - cluster/prob_snapshot/cluster_10:0.015128593040847203 - cluster/prob_snapshot/cluster_11:0.015128593040847203 - cluster/prob_snapshot/cluster_12:0.015128593040847203 - cluster/prob_snapshot/cluster_13:0.021936459909228444 - cluster/prob_snapshot/cluster_14:0.015128593040847203 - cluster/prob_snapshot/cluster_15:0.015128593040847203 - cluster/prob_snapshot/cluster_16:0.015128593040847203 - cluster/prob_snapshot/cluster_17:0.01739788199697428 - 
cluster/prob_snapshot/cluster_18:0.015128593040847203 - cluster/prob_snapshot/cluster_19:0.015128593040847203 - cluster/prob_snapshot/cluster_20:0.015128593040847203 - cluster/prob_snapshot/cluster_21:0.015128593040847203 - cluster/prob_snapshot/cluster_22:0.015128593040847203 - cluster/prob_snapshot/cluster_23:0.01739788199697428 - cluster/prob_snapshot/cluster_24:0.01739788199697428 - cluster/prob_snapshot/cluster_25:0.015128593040847203 - cluster/prob_snapshot/cluster_26:0.01739788199697428 - cluster/prob_snapshot/cluster_27:0.01739788199697428 - cluster/prob_snapshot/cluster_28:0.015128593040847203 - cluster/prob_snapshot/cluster_29:0.015128593040847203 - cluster/prob_snapshot/cluster_30:0.015128593040847203 - cluster/prob_snapshot/cluster_31:0.015128593040847203 - cluster/prob_snapshot/cluster_32:0.012859304084720122 - cluster/prob_snapshot/cluster_33:0.015128593040847203 - cluster/prob_snapshot/cluster_34:0.015128593040847203 - cluster/prob_snapshot/cluster_35:0.015128593040847203 - cluster/prob_snapshot/cluster_36:0.015128593040847203 - cluster/prob_snapshot/cluster_37:0.01739788199697428 - cluster/prob_snapshot/cluster_38:0.01739788199697428 - cluster/prob_snapshot/cluster_39:0.01739788199697428 - cluster/prob_snapshot/cluster_40:0.015128593040847203 - cluster/prob_snapshot/cluster_41:0.015128593040847203 - cluster/prob_snapshot/cluster_42:0.015128593040847203 - cluster/prob_snapshot/cluster_43:0.015128593040847203 - cluster/prob_snapshot/cluster_44:0.015128593040847203 - cluster/prob_snapshot/cluster_45:0.015128593040847203 - cluster/prob_snapshot/cluster_46:0.01739788199697428 - cluster/prob_snapshot/cluster_47:0.01739788199697428 - cluster/prob_snapshot/cluster_48:0.015128593040847203 - cluster/prob_snapshot/cluster_49:0.015128593040847203 - cluster/prob_snapshot/cluster_50:0.015128593040847203 - cluster/prob_snapshot/cluster_51:0.015128593040847203 - cluster/prob_snapshot/cluster_52:0.015128593040847203 - 
cluster/prob_snapshot/cluster_53:0.012859304084720122 - cluster/prob_snapshot/cluster_54:0.015128593040847203 - cluster/prob_snapshot/cluster_55:0.015128593040847203 - cluster/prob_snapshot/cluster_56:0.015128593040847203 - cluster/prob_snapshot/cluster_57:0.01739788199697428 - cluster/prob_snapshot/cluster_58:0.015128593040847203 - cluster/prob_snapshot/cluster_59:0.015128593040847203 - cluster/prob_snapshot/cluster_60:0.015128593040847203 - cluster/prob_snapshot/cluster_61:0.015128593040847203 - cluster/prob_snapshot/cluster_62:0.015128593040847203 - cluster/prob_snapshot/cluster_63:0.015128593040847203
[36m(TaskRunner pid=2823680)[0m Training Progress:  46%|████▌     | 367/800 [12:55:08<18:07:25, 150.68s/it]
[36m(TaskRunner pid=2823680)[0m step:367 - global_seqlen/min:571413 - global_seqlen/max:723725 - global_seqlen/minmax_diff:152312 - global_seqlen/balanced_min:621833 - global_seqlen/balanced_max:621975 - global_seqlen/mean:621916.0 - frontier/skipped_zero_acc_count:39.0 - actor/entropy:np.float64(0.13431422132998705) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.007440539076924324 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.009436415013624355) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.00140856424631137) - actor/ppo_kl:np.float64(-6.792219823207941e-05) - actor/pg_clipfrac_lower:np.float64(5.185694915578804e-05) - actor/grad_norm:np.float64(0.551863451798757) - perf/mfu/actor:np.float64(0.2978936117721189) - perf/max_memory_allocated_gb:np.float64(127.70515775680542) - perf/max_memory_reserved_gb:np.float64(134.326171875) - perf/cpu_memory_used_gb:np.float64(104.9276237487793) - actor/lr:np.float64(1e-06) - training/global_step:367 - training/epoch:0 - critic/score/mean:0.5519663095474243 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.5925211906433105 - critic/rewards/max:1.5709824562072754 - critic/rewards/min:-0.09461989253759384 - critic/advantages/mean:-0.06544723361730576 - critic/advantages/max:2.4745867252349854 - critic/advantages/min:-2.4747912883758545 - critic/returns/mean:-0.06544723361730576 - critic/returns/max:2.4745867252349854 - critic/returns/min:-2.4747912883758545 - response_length/mean:1763.3834228515625 - response_length/max:8192.0 - response_length/min:211.0 - response_length/clip_ratio:0.08567415922880173 - response_length_non_aborted/mean:1763.3834228515625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:211.0 - response_length_non_aborted/clip_ratio:0.08567415922880173 - response/aborted_ratio:0.0 - prompt_length/mean:249.30337524414062 - prompt_length/max:816.0 - prompt_length/min:173.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:0.0001102350652217865 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.61425487883389) - timing_s/agent_loop/generate_sequences/max:np.float64(43.320264647714794) - timing_s/agent_loop/generate_sequences/mean:np.float64(11.797006852631966) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(43.320264647714794) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:191 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:45.37710242345929 - timing_s/reward:0.00020631123334169388 - timing_s/old_log_prob:11.875510282814503 - timing_s/ref:29.21866632066667 - timing_s/adv:0.07114850357174873 - timing_s/update_actor:25.699886585585773 - timing_s/update_weights:33.179052823223174 - timing_s/step:145.90520729869604 - timing_s/stop_profile:5.14984130859375e-05 - timing_per_token_ms/adv:4.964889403925013e-05 - timing_per_token_ms/update_actor:0.017933911211804456 - timing_per_token_ms/gen:0.03614181944300712 - timing_per_token_ms/ref:0.020389388325786406 - perf/total_num_tokens:2487664 - perf/time_per_step:145.90520729869604 - perf/throughput:4262.4661005197595 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:63.0 - frontier/mean_score:2.07875 - frontier/mean_frontier_pct:0.020508836493369792 - frontier/batch_easy_count:0.0 - frontier/batch_medium_count:10.0 - frontier/batch_hard_count:6.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:1.91 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.3 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:0.0 - frontier/cluster_5/score:2.3 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.3 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:0.0 - frontier/cluster_9/score:2.3 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:1.7 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.9 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.0 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:0.0 - frontier/cluster_16/score:2.3 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.3 - frontier/cluster_17/cluster_size:193.0 - frontier/cluster_18/frontier:0.0 - 
frontier/cluster_18/score:2.0 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:0.0 - frontier/cluster_22/score:2.0 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.51 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.3 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.3 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.0 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:1.7 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.3 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.0 - 
frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:0.0 - frontier/cluster_37/score:2.3 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.3 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:0.0 - frontier/cluster_44/score:2.0 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:16.0 - frontier/cluster_47/score:2.51 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.0 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:0.0 - frontier/cluster_50/score:2.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:1.7 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.3 - frontier/cluster_54/cluster_size:122.0 - 
frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:0.0 - frontier/cluster_56/score:2.3 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.0 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:0.0 - frontier/cluster_60/score:2.0 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:367.0 - cluster/prob_snapshot/cluster_0:0.01503307276007216 - cluster/prob_snapshot/cluster_1:0.014356584485868911 - cluster/prob_snapshot/cluster_2:0.01728803367408298 - cluster/prob_snapshot/cluster_3:0.01503307276007216 - cluster/prob_snapshot/cluster_4:0.01503307276007216 - cluster/prob_snapshot/cluster_5:0.01728803367408298 - cluster/prob_snapshot/cluster_6:0.01503307276007216 - cluster/prob_snapshot/cluster_7:0.01728803367408298 - cluster/prob_snapshot/cluster_8:0.01503307276007216 - cluster/prob_snapshot/cluster_9:0.01728803367408298 - cluster/prob_snapshot/cluster_10:0.01503307276007216 - cluster/prob_snapshot/cluster_11:0.01728803367408298 - cluster/prob_snapshot/cluster_12:0.012778111846061336 - cluster/prob_snapshot/cluster_13:0.02179795550210463 - cluster/prob_snapshot/cluster_14:0.01503307276007216 - cluster/prob_snapshot/cluster_15:0.01503307276007216 - cluster/prob_snapshot/cluster_16:0.01728803367408298 - cluster/prob_snapshot/cluster_17:0.01728803367408298 - 
cluster/prob_snapshot/cluster_18:0.01503307276007216 - cluster/prob_snapshot/cluster_19:0.01503307276007216 - cluster/prob_snapshot/cluster_20:0.01503307276007216 - cluster/prob_snapshot/cluster_21:0.01503307276007216 - cluster/prob_snapshot/cluster_22:0.01503307276007216 - cluster/prob_snapshot/cluster_23:0.018866506313890558 - cluster/prob_snapshot/cluster_24:0.01728803367408298 - cluster/prob_snapshot/cluster_25:0.01503307276007216 - cluster/prob_snapshot/cluster_26:0.014356584485868911 - cluster/prob_snapshot/cluster_27:0.01728803367408298 - cluster/prob_snapshot/cluster_28:0.01503307276007216 - cluster/prob_snapshot/cluster_29:0.012778111846061336 - cluster/prob_snapshot/cluster_30:0.01503307276007216 - cluster/prob_snapshot/cluster_31:0.01503307276007216 - cluster/prob_snapshot/cluster_32:0.012778111846061336 - cluster/prob_snapshot/cluster_33:0.01728803367408298 - cluster/prob_snapshot/cluster_34:0.01503307276007216 - cluster/prob_snapshot/cluster_35:0.01503307276007216 - cluster/prob_snapshot/cluster_36:0.01503307276007216 - cluster/prob_snapshot/cluster_37:0.01728803367408298 - cluster/prob_snapshot/cluster_38:0.01728803367408298 - cluster/prob_snapshot/cluster_39:0.01728803367408298 - cluster/prob_snapshot/cluster_40:0.01503307276007216 - cluster/prob_snapshot/cluster_41:0.01503307276007216 - cluster/prob_snapshot/cluster_42:0.01503307276007216 - cluster/prob_snapshot/cluster_43:0.01503307276007216 - cluster/prob_snapshot/cluster_44:0.01503307276007216 - cluster/prob_snapshot/cluster_45:0.012778111846061336 - cluster/prob_snapshot/cluster_46:0.01728803367408298 - cluster/prob_snapshot/cluster_47:0.018866506313890558 - cluster/prob_snapshot/cluster_48:0.01503307276007216 - cluster/prob_snapshot/cluster_49:0.01503307276007216 - cluster/prob_snapshot/cluster_50:0.01503307276007216 - cluster/prob_snapshot/cluster_51:0.01503307276007216 - cluster/prob_snapshot/cluster_52:0.012778111846061336 - cluster/prob_snapshot/cluster_53:0.012778111846061336 - 
cluster/prob_snapshot/cluster_54:0.01728803367408298 - cluster/prob_snapshot/cluster_55:0.01503307276007216 - cluster/prob_snapshot/cluster_56:0.01728803367408298 - cluster/prob_snapshot/cluster_57:0.01728803367408298 - cluster/prob_snapshot/cluster_58:0.01503307276007216 - cluster/prob_snapshot/cluster_59:0.01503307276007216 - cluster/prob_snapshot/cluster_60:0.01503307276007216 - cluster/prob_snapshot/cluster_61:0.01503307276007216 - cluster/prob_snapshot/cluster_62:0.01503307276007216 - cluster/prob_snapshot/cluster_63:0.01503307276007216
(RewardLoopWorker pid=2826756) WARNING:2026-04-13 00:27:31,075:WARNING: Error in configuration: macro '\frac' failed its substitution!
(TaskRunner pid=2823680) Training Progress:  46%|████▌     | 368/800 [12:57:43<18:16:10, 152.25s/it]
(TaskRunner pid=2823680) step:368 - global_seqlen/min:530399 - global_seqlen/max:701382 - global_seqlen/minmax_diff:170983 - global_seqlen/balanced_min:609811 - global_seqlen/balanced_max:609883 - global_seqlen/mean:609839.5 - frontier/skipped_zero_acc_count:39.0 - actor/entropy:np.float64(0.1492640661696593) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.005500006023794413 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.05128554880502634) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.001425060634371928) - actor/ppo_kl:np.float64(-0.00014107650643685094) - actor/pg_clipfrac_lower:np.float64(0.00010282207353561211) - actor/grad_norm:np.float64(0.642117653042078) - perf/mfu/actor:np.float64(0.2882512258766292) - perf/max_memory_allocated_gb:np.float64(127.70515775680542) - perf/max_memory_reserved_gb:np.float64(134.326171875) - perf/cpu_memory_used_gb:np.float64(105.1246566772461) - actor/lr:np.float64(1e-06) - training/global_step:368 - training/epoch:0 - critic/score/mean:0.601123571395874 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6545151472091675 - critic/rewards/max:1.3643332719802856 - critic/rewards/min:-0.0938233956694603 - critic/advantages/mean:-0.13615544140338898 - critic/advantages/max:2.4742910861968994 - critic/advantages/min:-2.474625825881958 - critic/returns/mean:-0.13615544140338898 - critic/returns/max:2.4742910861968994 - critic/returns/min:-2.474625825881958 - response_length/mean:1886.3961181640625 - response_length/max:8192.0 - response_length/min:185.0 - response_length/clip_ratio:0.1081460639834404 - response_length_non_aborted/mean:1886.3961181640625 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:185.0 - response_length_non_aborted/clip_ratio:0.1081460639834404 - response/aborted_ratio:0.0 - prompt_length/mean:236.1011199951172 - prompt_length/max:508.0 - prompt_length/min:180.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.285976946353912e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.4222626751288772) - timing_s/agent_loop/generate_sequences/max:np.float64(46.19481075555086) - timing_s/agent_loop/generate_sequences/mean:np.float64(11.571586469890462) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(46.19481075555086) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:190 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:48.00042460486293 - timing_s/reward:0.0001275818794965744 - timing_s/old_log_prob:12.368189758621156 - timing_s/ref:31.05938438139856 - timing_s/adv:0.08291163016110659 - timing_s/update_actor:26.101859837770462 - timing_s/update_weights:35.195146354846656 - timing_s/step:153.7970762439072 - timing_s/stop_profile:4.82192263007164e-05 - timing_per_token_ms/adv:5.486410971885366e-05 - timing_per_token_ms/update_actor:0.017272067853724916 - timing_per_token_ms/gen:0.0357381611723673 - timing_per_token_ms/ref:0.020552550579333067 - perf/total_num_tokens:2439358 - perf/time_per_step:153.7970762439072 - perf/throughput:3965.2216732186366 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:102.0 - frontier/mean_score:2.10959375 - frontier/mean_frontier_pct:0.03836337829297317 - frontier/batch_easy_count:1.0 - frontier/batch_medium_count:11.0 - frontier/batch_hard_count:4.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:0.0 - frontier/cluster_0/score:2.0 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:1.91 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.3 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:1.91 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.3 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:2.51 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:1.7 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:16.0 - frontier/cluster_13/score:2.9299999999999997 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.3 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.0 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.3 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:0.0 - frontier/cluster_18/score:2.3 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.0 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.6569999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.3 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.3 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:1.7 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:0.0 - frontier/cluster_33/score:2.3 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 - 
frontier/cluster_36/score:2.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.3 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:0.0 - frontier/cluster_39/score:2.3 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:0.0 - frontier/cluster_40/score:2.0 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.7 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:1.7 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:3.2569999999999997 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:0.0 - frontier/cluster_49/score:2.3 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:0.0 - frontier/cluster_50/score:2.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.0 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:1.7 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:0.0 - frontier/cluster_54/score:2.3 - 
frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.0 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.51 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.0 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.0 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:368.0 - cluster/prob_snapshot/cluster_0:0.014813278622957619 - cluster/prob_snapshot/cluster_1:0.014146681084924525 - cluster/prob_snapshot/cluster_2:0.01703527041640126 - cluster/prob_snapshot/cluster_3:0.014813278622957619 - cluster/prob_snapshot/cluster_4:0.014813278622957619 - cluster/prob_snapshot/cluster_5:0.014146681084924525 - cluster/prob_snapshot/cluster_6:0.014813278622957619 - cluster/prob_snapshot/cluster_7:0.01703527041640126 - cluster/prob_snapshot/cluster_8:0.014813278622957619 - cluster/prob_snapshot/cluster_9:0.01859066467181181 - cluster/prob_snapshot/cluster_10:0.014813278622957619 - cluster/prob_snapshot/cluster_11:0.01703527041640126 - cluster/prob_snapshot/cluster_12:0.012591286829513975 - cluster/prob_snapshot/cluster_13:0.021701453182632908 - cluster/prob_snapshot/cluster_14:0.01703527041640126 - cluster/prob_snapshot/cluster_15:0.014813278622957619 - cluster/prob_snapshot/cluster_16:0.01859066467181181 - 
cluster/prob_snapshot/cluster_17:0.01703527041640126 - cluster/prob_snapshot/cluster_18:0.01703527041640126 - cluster/prob_snapshot/cluster_19:0.014813278622957619 - cluster/prob_snapshot/cluster_20:0.014813278622957619 - cluster/prob_snapshot/cluster_21:0.014813278622957619 - cluster/prob_snapshot/cluster_22:0.012591286829513975 - cluster/prob_snapshot/cluster_23:0.019679440650599192 - cluster/prob_snapshot/cluster_24:0.01703527041640126 - cluster/prob_snapshot/cluster_25:0.014813278622957619 - cluster/prob_snapshot/cluster_26:0.014146681084924525 - cluster/prob_snapshot/cluster_27:0.01703527041640126 - cluster/prob_snapshot/cluster_28:0.01703527041640126 - cluster/prob_snapshot/cluster_29:0.012591286829513975 - cluster/prob_snapshot/cluster_30:0.014813278622957619 - cluster/prob_snapshot/cluster_31:0.014813278622957619 - cluster/prob_snapshot/cluster_32:0.012591286829513975 - cluster/prob_snapshot/cluster_33:0.01703527041640126 - cluster/prob_snapshot/cluster_34:0.014813278622957619 - cluster/prob_snapshot/cluster_35:0.014813278622957619 - cluster/prob_snapshot/cluster_36:0.014813278622957619 - cluster/prob_snapshot/cluster_37:0.01859066467181181 - cluster/prob_snapshot/cluster_38:0.01703527041640126 - cluster/prob_snapshot/cluster_39:0.01703527041640126 - cluster/prob_snapshot/cluster_40:0.014813278622957619 - cluster/prob_snapshot/cluster_41:0.014813278622957619 - cluster/prob_snapshot/cluster_42:0.014813278622957619 - cluster/prob_snapshot/cluster_43:0.014813278622957619 - cluster/prob_snapshot/cluster_44:0.012591286829513975 - cluster/prob_snapshot/cluster_45:0.012591286829513975 - cluster/prob_snapshot/cluster_46:0.01703527041640126 - cluster/prob_snapshot/cluster_47:0.02412342423748648 - cluster/prob_snapshot/cluster_48:0.014813278622957619 - cluster/prob_snapshot/cluster_49:0.01703527041640126 - cluster/prob_snapshot/cluster_50:0.014813278622957619 - cluster/prob_snapshot/cluster_51:0.014813278622957619 - 
cluster/prob_snapshot/cluster_52:0.012591286829513975 - cluster/prob_snapshot/cluster_53:0.012591286829513975 - cluster/prob_snapshot/cluster_54:0.01703527041640126 - cluster/prob_snapshot/cluster_55:0.014813278622957619 - cluster/prob_snapshot/cluster_56:0.01859066467181181 - cluster/prob_snapshot/cluster_57:0.01703527041640126 - cluster/prob_snapshot/cluster_58:0.01703527041640126 - cluster/prob_snapshot/cluster_59:0.014813278622957619 - cluster/prob_snapshot/cluster_60:0.012591286829513975 - cluster/prob_snapshot/cluster_61:0.014813278622957619 - cluster/prob_snapshot/cluster_62:0.014813278622957619 - cluster/prob_snapshot/cluster_63:0.014813278622957619
(TaskRunner pid=2823680) Training Progress:  46%|████▌     | 369/800 [13:00:11<18:03:44, 150.87s/it]
(TaskRunner pid=2823680) step:369 - global_seqlen/min:465147 - global_seqlen/max:612300 - global_seqlen/minmax_diff:147153 - global_seqlen/balanced_min:538118 - global_seqlen/balanced_max:538248 - global_seqlen/mean:538164.5 - frontier/skipped_zero_acc_count:31.0 - actor/entropy:np.float64(0.13285834166430394) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.008000782690942287 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(-0.03754214862419758) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0012631680305907028) - actor/ppo_kl:np.float64(-0.00015751302814063522) - actor/pg_clipfrac_lower:np.float64(8.049760833032173e-05) - actor/grad_norm:np.float64(0.6140936750632066) - perf/mfu/actor:np.float64(0.24154798853542506) - perf/max_memory_allocated_gb:np.float64(127.70515775680542) - perf/max_memory_reserved_gb:np.float64(134.326171875) - perf/cpu_memory_used_gb:np.float64(105.54255294799805) - actor/lr:np.float64(1e-06) - training/global_step:369 - training/epoch:0 - critic/score/mean:0.6507731676101685 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.6898838877677917 - critic/rewards/max:1.482946515083313 - critic/rewards/min:-0.0935933068394661 - critic/advantages/mean:-0.16880345344543457 - critic/advantages/max:2.474412441253662 - critic/advantages/min:-2.4746999740600586 - critic/returns/mean:-0.16880345344543457 - critic/returns/max:2.474412441253662 - critic/returns/min:-2.4746999740600586 - response_length/mean:1680.33251953125 - response_length/max:8192.0 - response_length/min:213.0 - response_length/clip_ratio:0.08634020388126373 - response_length_non_aborted/mean:1680.33251953125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:213.0 - response_length_non_aborted/clip_ratio:0.08634020388126373 - response/aborted_ratio:0.0 - prompt_length/mean:228.67010498046875 - prompt_length/max:390.0 - prompt_length/min:178.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.282158523797989e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(1.6434627324342728) - timing_s/agent_loop/generate_sequences/max:np.float64(39.688985913060606) - timing_s/agent_loop/generate_sequences/mean:np.float64(9.887153924171798) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(39.688985913060606) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:187 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:41.90003755129874 - timing_s/reward:0.00022458843886852264 - timing_s/old_log_prob:12.243205239064991 - timing_s/ref:27.48576459567994 - timing_s/adv:0.0934488121420145 - timing_s/update_actor:27.299666597507894 - timing_s/update_weights:35.83951424714178 - timing_s/step:145.279409494251 - timing_s/stop_profile:6.39054924249649e-05 - timing_per_token_ms/adv:6.308201383165124e-05 - timing_per_token_ms/update_actor:0.018428462667736765 - timing_per_token_ms/gen:0.03213345845530902 - timing_per_token_ms/ref:0.0185540869129855 - perf/total_num_tokens:2152658 - perf/time_per_step:145.279409494251 - perf/throughput:3704.341185536662 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:133.0 - frontier/mean_score:2.1909687499999997 - frontier/mean_frontier_pct:0.05277286121965522 - frontier/batch_easy_count:2.0 - frontier/batch_medium_count:14.0 - frontier/batch_hard_count:0.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:2.9 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:1.91 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.3 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:1.91 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.3 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:2.51 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:0.0 - frontier/cluster_10/score:2.0 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:0.0 - frontier/cluster_11/score:2.3 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:1.7 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.9509999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.3 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:0.0 - frontier/cluster_15/score:2.3 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.3 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.51 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:16.0 - frontier/cluster_23/score:2.6569999999999996 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:0.0 - frontier/cluster_24/score:2.3 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:0.0 - frontier/cluster_27/score:2.3 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:1.7 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:2.51 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:0.0 - frontier/cluster_35/score:2.0 - frontier/cluster_35/cluster_size:184.0 - frontier/cluster_36/frontier:0.0 
- frontier/cluster_36/score:2.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.3 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.51 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.0 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:0.0 - frontier/cluster_43/score:2.0 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.7 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:16.0 - frontier/cluster_45/score:2.09 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:3.2569999999999997 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.51 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:0.0 - frontier/cluster_50/score:2.0 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.3 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:1.7 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 - frontier/cluster_54/frontier:16.0 - 
frontier/cluster_54/score:2.51 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:0.0 - frontier/cluster_55/score:2.3 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.6569999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.3 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.3 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.0 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:369.0 - cluster/prob_snapshot/cluster_0:0.020681490778907734 - cluster/prob_snapshot/cluster_1:0.013621257719901301 - cluster/prob_snapshot/cluster_2:0.016402561652237167 - cluster/prob_snapshot/cluster_3:0.014263097088901886 - cluster/prob_snapshot/cluster_4:0.014263097088901886 - cluster/prob_snapshot/cluster_5:0.013621257719901301 - cluster/prob_snapshot/cluster_6:0.014263097088901886 - cluster/prob_snapshot/cluster_7:0.016402561652237167 - cluster/prob_snapshot/cluster_8:0.014263097088901886 - cluster/prob_snapshot/cluster_9:0.017900186846571867 - cluster/prob_snapshot/cluster_10:0.014263097088901886 - cluster/prob_snapshot/cluster_11:0.016402561652237167 - cluster/prob_snapshot/cluster_12:0.012123632525566603 - cluster/prob_snapshot/cluster_13:0.02104519975467473 - cluster/prob_snapshot/cluster_14:0.016402561652237167 - cluster/prob_snapshot/cluster_15:0.016402561652237167 - 
cluster/prob_snapshot/cluster_16:0.017900186846571867 - cluster/prob_snapshot/cluster_17:0.016402561652237167 - cluster/prob_snapshot/cluster_18:0.017900186846571867 - cluster/prob_snapshot/cluster_19:0.016402561652237167 - cluster/prob_snapshot/cluster_20:0.014263097088901886 - cluster/prob_snapshot/cluster_21:0.014263097088901886 - cluster/prob_snapshot/cluster_22:0.012123632525566603 - cluster/prob_snapshot/cluster_23:0.018948524482606154 - cluster/prob_snapshot/cluster_24:0.016402561652237167 - cluster/prob_snapshot/cluster_25:0.014263097088901886 - cluster/prob_snapshot/cluster_26:0.013621257719901301 - cluster/prob_snapshot/cluster_27:0.016402561652237167 - cluster/prob_snapshot/cluster_28:0.016402561652237167 - cluster/prob_snapshot/cluster_29:0.012123632525566603 - cluster/prob_snapshot/cluster_30:0.014263097088901886 - cluster/prob_snapshot/cluster_31:0.014263097088901886 - cluster/prob_snapshot/cluster_32:0.012123632525566603 - cluster/prob_snapshot/cluster_33:0.017900186846571867 - cluster/prob_snapshot/cluster_34:0.014263097088901886 - cluster/prob_snapshot/cluster_35:0.014263097088901886 - cluster/prob_snapshot/cluster_36:0.014263097088901886 - cluster/prob_snapshot/cluster_37:0.017900186846571867 - cluster/prob_snapshot/cluster_38:0.016402561652237167 - cluster/prob_snapshot/cluster_39:0.017900186846571867 - cluster/prob_snapshot/cluster_40:0.020681490778907734 - cluster/prob_snapshot/cluster_41:0.014263097088901886 - cluster/prob_snapshot/cluster_42:0.014263097088901886 - cluster/prob_snapshot/cluster_43:0.014263097088901886 - cluster/prob_snapshot/cluster_44:0.012123632525566603 - cluster/prob_snapshot/cluster_45:0.01490493645790247 - cluster/prob_snapshot/cluster_46:0.016402561652237167 - cluster/prob_snapshot/cluster_47:0.023227453609276718 - cluster/prob_snapshot/cluster_48:0.014263097088901886 - cluster/prob_snapshot/cluster_49:0.017900186846571867 - cluster/prob_snapshot/cluster_50:0.014263097088901886 - 
cluster/prob_snapshot/cluster_51:0.016402561652237167 - cluster/prob_snapshot/cluster_52:0.012123632525566603 - cluster/prob_snapshot/cluster_53:0.012123632525566603 - cluster/prob_snapshot/cluster_54:0.017900186846571867 - cluster/prob_snapshot/cluster_55:0.016402561652237167 - cluster/prob_snapshot/cluster_56:0.018948524482606154 - cluster/prob_snapshot/cluster_57:0.016402561652237167 - cluster/prob_snapshot/cluster_58:0.016402561652237167 - cluster/prob_snapshot/cluster_59:0.016402561652237167 - cluster/prob_snapshot/cluster_60:0.012123632525566603 - cluster/prob_snapshot/cluster_61:0.016402561652237167 - cluster/prob_snapshot/cluster_62:0.014263097088901886 - cluster/prob_snapshot/cluster_63:0.014263097088901886
[36m(TaskRunner pid=2823680)[0m Training Progress:  46%|████▋     | 370/800 [13:02:40<17:57:26, 150.34s/it]
[36m(TaskRunner pid=2823680)[0m step:370 - global_seqlen/min:416522 - global_seqlen/max:565596 - global_seqlen/minmax_diff:149074 - global_seqlen/balanced_min:497546 - global_seqlen/balanced_max:497601 - global_seqlen/mean:497565.0 - frontier/skipped_zero_acc_count:24.0 - actor/entropy:np.float64(0.15261437833452454) - perf/mfu/actor_infer:0 - actor/reward_kl_penalty:0.01149726565927267 - actor/reward_kl_penalty_coeff:0.001 - actor/pg_loss:np.float64(0.027633350480755325) - actor/kl_loss:np.float64(0.0) - actor/pg_clipfrac:np.float64(0.0011261109863111274) - actor/ppo_kl:np.float64(0.00013649031322197165) - actor/pg_clipfrac_lower:np.float64(3.8341125096289594e-05) - actor/grad_norm:np.float64(0.8307613684580877) - perf/mfu/actor:np.float64(0.17920465156662074) - perf/max_memory_allocated_gb:np.float64(127.70515775680542) - perf/max_memory_reserved_gb:np.float64(134.326171875) - perf/cpu_memory_used_gb:np.float64(106.37765884399414) - actor/lr:np.float64(1e-06) - training/global_step:370 - training/epoch:0 - critic/score/mean:0.7536057829856873 - critic/score/max:1.0 - critic/score/min:0.0 - critic/rewards/mean:0.7856676578521729 - critic/rewards/max:1.8527369499206543 - critic/rewards/min:-0.08245779573917389 - critic/advantages/mean:-0.18217681348323822 - critic/advantages/max:2.474360942840576 - critic/advantages/min:-2.4748499393463135 - critic/returns/mean:-0.18217681348323822 - critic/returns/max:2.474360942840576 - critic/returns/min:-2.4748499393463135 - response_length/mean:1376.454345703125 - response_length/max:8192.0 - response_length/min:80.0 - response_length/clip_ratio:0.061298076063394547 - response_length_non_aborted/mean:1376.454345703125 - response_length_non_aborted/max:8192.0 - response_length_non_aborted/min:80.0 - response_length_non_aborted/clip_ratio:0.061298076063394547 - response/aborted_ratio:0.0 - prompt_length/mean:229.08653259277344 - prompt_length/max:507.0 - prompt_length/min:170.0 - prompt_length/clip_ratio:0.0 - 
num_turns/min:np.int32(2) - num_turns/max:np.int32(2) - num_turns/mean:np.float64(2.0) - timing_s/start_profile:8.004996925592422e-05 - timing_s/agent_loop/num_preempted/min:np.int64(-1) - timing_s/agent_loop/num_preempted/max:np.int64(-1) - timing_s/agent_loop/num_preempted/mean:np.float64(-1.0) - timing_s/agent_loop/generate_sequences/min:np.float64(0.700423888862133) - timing_s/agent_loop/generate_sequences/max:np.float64(39.03192979004234) - timing_s/agent_loop/generate_sequences/mean:np.float64(9.246425170083967) - timing_s/agent_loop/tool_calls/min:np.float64(0.0) - timing_s/agent_loop/tool_calls/max:np.float64(0.0) - timing_s/agent_loop/tool_calls/mean:np.float64(0.0) - timing_s/agent_loop/slowest/generate_sequences:np.float64(39.03192979004234) - timing_s/agent_loop/slowest/tool_calls:np.float64(0.0) - timing_s/agent_loop/slowest/prompt_length:191 - timing_s/agent_loop/slowest/response_length:8192 - timing_s/agent_loop/slowest/num_preempted:np.int64(-1) - timing_s/gen:41.17259958293289 - timing_s/reward:0.00014111213386058807 - timing_s/old_log_prob:11.463596215471625 - timing_s/ref:20.277799154631793 - timing_s/adv:0.08902443200349808 - timing_s/update_actor:33.89509891625494 - timing_s/update_weights:39.69559197220951 - timing_s/step:147.0064945332706 - timing_s/stop_profile:8.193496614694595e-05 - timing_per_token_ms/adv:6.664453178483323e-05 - timing_per_token_ms/update_actor:0.025374191626245452 - timing_per_token_ms/gen:0.03595200843769517 - timing_per_token_ms/ref:0.015180152233200675 - perf/total_num_tokens:1990260 - perf/time_per_step:147.0064945332706 - perf/throughput:3384.646383003105 - frontier/active_count:64.0 - frontier/completed_count:0.0 - frontier/blacklisted_count:157.0 - frontier/mean_score:2.302446875 - frontier/mean_frontier_pct:0.06884843777544582 - frontier/batch_easy_count:7.0 - frontier/batch_medium_count:8.0 - frontier/batch_hard_count:1.0 - frontier/force_completed_count:0.0 - frontier/replay_slots_count:0.0 - 
frontier/replay_pool_size:0.0 - frontier/cluster_0/frontier:16.0 - frontier/cluster_0/score:2.9 - frontier/cluster_0/cluster_size:120.0 - frontier/cluster_1/frontier:16.0 - frontier/cluster_1/score:1.91 - frontier/cluster_1/cluster_size:138.0 - frontier/cluster_2/frontier:0.0 - frontier/cluster_2/score:2.3 - frontier/cluster_2/cluster_size:197.0 - frontier/cluster_3/frontier:0.0 - frontier/cluster_3/score:2.0 - frontier/cluster_3/cluster_size:147.0 - frontier/cluster_4/frontier:0.0 - frontier/cluster_4/score:2.0 - frontier/cluster_4/cluster_size:62.0 - frontier/cluster_5/frontier:16.0 - frontier/cluster_5/score:1.91 - frontier/cluster_5/cluster_size:238.0 - frontier/cluster_6/frontier:0.0 - frontier/cluster_6/score:2.0 - frontier/cluster_6/cluster_size:319.0 - frontier/cluster_7/frontier:0.0 - frontier/cluster_7/score:2.3 - frontier/cluster_7/cluster_size:139.0 - frontier/cluster_8/frontier:0.0 - frontier/cluster_8/score:2.0 - frontier/cluster_8/cluster_size:241.0 - frontier/cluster_9/frontier:16.0 - frontier/cluster_9/score:2.51 - frontier/cluster_9/cluster_size:175.0 - frontier/cluster_10/frontier:16.0 - frontier/cluster_10/score:2.9 - frontier/cluster_10/cluster_size:220.0 - frontier/cluster_11/frontier:16.0 - frontier/cluster_11/score:2.51 - frontier/cluster_11/cluster_size:190.0 - frontier/cluster_12/frontier:16.0 - frontier/cluster_12/score:1.7 - frontier/cluster_12/cluster_size:249.0 - frontier/cluster_13/frontier:32.0 - frontier/cluster_13/score:2.9656999999999996 - frontier/cluster_13/cluster_size:82.0 - frontier/cluster_14/frontier:0.0 - frontier/cluster_14/score:2.3 - frontier/cluster_14/cluster_size:208.0 - frontier/cluster_15/frontier:16.0 - frontier/cluster_15/score:3.11 - frontier/cluster_15/cluster_size:161.0 - frontier/cluster_16/frontier:16.0 - frontier/cluster_16/score:2.51 - frontier/cluster_16/cluster_size:150.0 - frontier/cluster_17/frontier:0.0 - frontier/cluster_17/score:2.3 - frontier/cluster_17/cluster_size:193.0 - 
frontier/cluster_18/frontier:16.0 - frontier/cluster_18/score:2.51 - frontier/cluster_18/cluster_size:216.0 - frontier/cluster_19/frontier:0.0 - frontier/cluster_19/score:2.3 - frontier/cluster_19/cluster_size:127.0 - frontier/cluster_20/frontier:0.0 - frontier/cluster_20/score:2.0 - frontier/cluster_20/cluster_size:136.0 - frontier/cluster_21/frontier:0.0 - frontier/cluster_21/score:2.0 - frontier/cluster_21/cluster_size:183.0 - frontier/cluster_22/frontier:16.0 - frontier/cluster_22/score:1.7 - frontier/cluster_22/cluster_size:233.0 - frontier/cluster_23/frontier:32.0 - frontier/cluster_23/score:3.3598999999999997 - frontier/cluster_23/cluster_size:227.0 - frontier/cluster_24/frontier:16.0 - frontier/cluster_24/score:3.11 - frontier/cluster_24/cluster_size:97.0 - frontier/cluster_25/frontier:0.0 - frontier/cluster_25/score:2.0 - frontier/cluster_25/cluster_size:298.0 - frontier/cluster_26/frontier:16.0 - frontier/cluster_26/score:1.91 - frontier/cluster_26/cluster_size:213.0 - frontier/cluster_27/frontier:16.0 - frontier/cluster_27/score:3.11 - frontier/cluster_27/cluster_size:213.0 - frontier/cluster_28/frontier:0.0 - frontier/cluster_28/score:2.3 - frontier/cluster_28/cluster_size:126.0 - frontier/cluster_29/frontier:16.0 - frontier/cluster_29/score:1.7 - frontier/cluster_29/cluster_size:48.0 - frontier/cluster_30/frontier:0.0 - frontier/cluster_30/score:2.0 - frontier/cluster_30/cluster_size:226.0 - frontier/cluster_31/frontier:0.0 - frontier/cluster_31/score:2.0 - frontier/cluster_31/cluster_size:121.0 - frontier/cluster_32/frontier:16.0 - frontier/cluster_32/score:1.7 - frontier/cluster_32/cluster_size:270.0 - frontier/cluster_33/frontier:16.0 - frontier/cluster_33/score:2.51 - frontier/cluster_33/cluster_size:171.0 - frontier/cluster_34/frontier:0.0 - frontier/cluster_34/score:2.0 - frontier/cluster_34/cluster_size:121.0 - frontier/cluster_35/frontier:16.0 - frontier/cluster_35/score:2.9 - frontier/cluster_35/cluster_size:184.0 - 
frontier/cluster_36/frontier:0.0 - frontier/cluster_36/score:2.0 - frontier/cluster_36/cluster_size:108.0 - frontier/cluster_37/frontier:16.0 - frontier/cluster_37/score:2.51 - frontier/cluster_37/cluster_size:193.0 - frontier/cluster_38/frontier:0.0 - frontier/cluster_38/score:2.3 - frontier/cluster_38/cluster_size:164.0 - frontier/cluster_39/frontier:16.0 - frontier/cluster_39/score:2.6569999999999996 - frontier/cluster_39/cluster_size:246.0 - frontier/cluster_40/frontier:16.0 - frontier/cluster_40/score:2.9 - frontier/cluster_40/cluster_size:102.0 - frontier/cluster_41/frontier:0.0 - frontier/cluster_41/score:2.3 - frontier/cluster_41/cluster_size:190.0 - frontier/cluster_42/frontier:0.0 - frontier/cluster_42/score:2.0 - frontier/cluster_42/cluster_size:392.0 - frontier/cluster_43/frontier:16.0 - frontier/cluster_43/score:2.9 - frontier/cluster_43/cluster_size:209.0 - frontier/cluster_44/frontier:16.0 - frontier/cluster_44/score:1.7 - frontier/cluster_44/cluster_size:197.0 - frontier/cluster_45/frontier:32.0 - frontier/cluster_45/score:2.3629999999999995 - frontier/cluster_45/cluster_size:240.0 - frontier/cluster_46/frontier:0.0 - frontier/cluster_46/score:2.3 - frontier/cluster_46/cluster_size:191.0 - frontier/cluster_47/frontier:32.0 - frontier/cluster_47/score:3.2569999999999997 - frontier/cluster_47/cluster_size:155.0 - frontier/cluster_48/frontier:0.0 - frontier/cluster_48/score:2.0 - frontier/cluster_48/cluster_size:302.0 - frontier/cluster_49/frontier:16.0 - frontier/cluster_49/score:2.51 - frontier/cluster_49/cluster_size:219.0 - frontier/cluster_50/frontier:16.0 - frontier/cluster_50/score:1.7 - frontier/cluster_50/cluster_size:228.0 - frontier/cluster_51/frontier:0.0 - frontier/cluster_51/score:2.3 - frontier/cluster_51/cluster_size:216.0 - frontier/cluster_52/frontier:16.0 - frontier/cluster_52/score:1.7 - frontier/cluster_52/cluster_size:91.0 - frontier/cluster_53/frontier:16.0 - frontier/cluster_53/score:1.7 - frontier/cluster_53/cluster_size:300.0 
- frontier/cluster_54/frontier:16.0 - frontier/cluster_54/score:2.6569999999999996 - frontier/cluster_54/cluster_size:122.0 - frontier/cluster_55/frontier:16.0 - frontier/cluster_55/score:2.51 - frontier/cluster_55/cluster_size:99.0 - frontier/cluster_56/frontier:16.0 - frontier/cluster_56/score:2.6569999999999996 - frontier/cluster_56/cluster_size:133.0 - frontier/cluster_57/frontier:0.0 - frontier/cluster_57/score:2.3 - frontier/cluster_57/cluster_size:209.0 - frontier/cluster_58/frontier:0.0 - frontier/cluster_58/score:2.3 - frontier/cluster_58/cluster_size:105.0 - frontier/cluster_59/frontier:0.0 - frontier/cluster_59/score:2.3 - frontier/cluster_59/cluster_size:174.0 - frontier/cluster_60/frontier:16.0 - frontier/cluster_60/score:1.7 - frontier/cluster_60/cluster_size:38.0 - frontier/cluster_61/frontier:0.0 - frontier/cluster_61/score:2.3 - frontier/cluster_61/cluster_size:54.0 - frontier/cluster_62/frontier:0.0 - frontier/cluster_62/score:2.0 - frontier/cluster_62/cluster_size:70.0 - frontier/cluster_63/frontier:0.0 - frontier/cluster_63/score:2.3 - frontier/cluster_63/cluster_size:261.0 - cluster/prob_snapshot_step:370.0 - cluster/prob_snapshot/cluster_0:0.01968015005775106 - cluster/prob_snapshot/cluster_1:0.012961754003553285 - cluster/prob_snapshot/cluster_2:0.015608394873388772 - cluster/prob_snapshot/cluster_3:0.01357251728120763 - cluster/prob_snapshot/cluster_4:0.01357251728120763 - cluster/prob_snapshot/cluster_5:0.012961754003553285 - cluster/prob_snapshot/cluster_6:0.01357251728120763 - cluster/prob_snapshot/cluster_7:0.015608394873388772 - cluster/prob_snapshot/cluster_8:0.01357251728120763 - cluster/prob_snapshot/cluster_9:0.017033509187915574 - cluster/prob_snapshot/cluster_10:0.01968015005775106 - cluster/prob_snapshot/cluster_11:0.017033509187915574 - cluster/prob_snapshot/cluster_12:0.011536639689026485 - cluster/prob_snapshot/cluster_13:0.02012600725043873 - cluster/prob_snapshot/cluster_14:0.015608394873388772 - 
cluster/prob_snapshot/cluster_15:0.021105264372277863 - cluster/prob_snapshot/cluster_16:0.017033509187915574 - cluster/prob_snapshot/cluster_17:0.015608394873388772 - cluster/prob_snapshot/cluster_18:0.017033509187915574 - cluster/prob_snapshot/cluster_19:0.015608394873388772 - cluster/prob_snapshot/cluster_20:0.01357251728120763 - cluster/prob_snapshot/cluster_21:0.01357251728120763 - cluster/prob_snapshot/cluster_22:0.011536639689026485 - cluster/prob_snapshot/cluster_23:0.022801150406564757 - cluster/prob_snapshot/cluster_24:0.021105264372277863 - cluster/prob_snapshot/cluster_25:0.01357251728120763 - cluster/prob_snapshot/cluster_26:0.012961754003553285 - cluster/prob_snapshot/cluster_27:0.021105264372277863 - cluster/prob_snapshot/cluster_28:0.015608394873388772 - cluster/prob_snapshot/cluster_29:0.011536639689026485 - cluster/prob_snapshot/cluster_30:0.01357251728120763 - cluster/prob_snapshot/cluster_31:0.01357251728120763 - cluster/prob_snapshot/cluster_32:0.011536639689026485 - cluster/prob_snapshot/cluster_33:0.017033509187915574 - cluster/prob_snapshot/cluster_34:0.01357251728120763 - cluster/prob_snapshot/cluster_35:0.01968015005775106 - cluster/prob_snapshot/cluster_36:0.01357251728120763 - cluster/prob_snapshot/cluster_37:0.017033509187915574 - cluster/prob_snapshot/cluster_38:0.015608394873388772 - cluster/prob_snapshot/cluster_39:0.018031089208084335 - cluster/prob_snapshot/cluster_40:0.01968015005775106 - cluster/prob_snapshot/cluster_41:0.015608394873388772 - cluster/prob_snapshot/cluster_42:0.01357251728120763 - cluster/prob_snapshot/cluster_43:0.01968015005775106 - cluster/prob_snapshot/cluster_44:0.011536639689026485 - cluster/prob_snapshot/cluster_45:0.01603592916774681 - cluster/prob_snapshot/cluster_46:0.015608394873388772 - cluster/prob_snapshot/cluster_47:0.022102844392446624 - cluster/prob_snapshot/cluster_48:0.01357251728120763 - cluster/prob_snapshot/cluster_49:0.017033509187915574 - 
cluster/prob_snapshot/cluster_50:0.011536639689026485 - cluster/prob_snapshot/cluster_51:0.015608394873388772 - cluster/prob_snapshot/cluster_52:0.011536639689026485 - cluster/prob_snapshot/cluster_53:0.011536639689026485 - cluster/prob_snapshot/cluster_54:0.018031089208084335 - cluster/prob_snapshot/cluster_55:0.017033509187915574 - cluster/prob_snapshot/cluster_56:0.018031089208084335 - cluster/prob_snapshot/cluster_57:0.015608394873388772 - cluster/prob_snapshot/cluster_58:0.015608394873388772 - cluster/prob_snapshot/cluster_59:0.015608394873388772 - cluster/prob_snapshot/cluster_60:0.011536639689026485 - cluster/prob_snapshot/cluster_61:0.015608394873388772 - cluster/prob_snapshot/cluster_62:0.01357251728120763 - cluster/prob_snapshot/cluster_63:0.015608394873388772
[36m(RewardLoopWorker pid=2826754)[0m WARNING:2026-04-13 00:35:09,549:WARNING: Error in configuration: macro '\frac' failed its substitution!