[2026-03-24 17:43:34,672][mllm.models.large_language_model_local][INFO] - Initializing adapter 'agent_adapter': no initial weights provided or found; starting from scratch.
[2026-03-24 17:43:35,468][mllm.models.adapter_training_wrapper][INFO] - Adapter 'agent_adapter': initialized with fresh weights (no initial weights found).
[2026-03-24 17:43:35,474][mllm.models.large_language_model_local][INFO] - Initializing adapter 'critic_adapter': no initial weights provided or found; starting from scratch.
[2026-03-24 17:43:36,290][mllm.models.adapter_training_wrapper][INFO] - Adapter 'critic_adapter': initialized with fresh weights (no initial weights found).
[2026-03-24 17:46:27,033][__main__][INFO] - Starting iteration 0.
[2026-03-24 17:46:27,040][__main__][INFO] - Inference policies count is regular policies 2 and buffer policies 0 and human policies 1.
[2026-03-24 17:46:27,040][__main__][INFO] - Hard coded buffer agents are set to False with prob 0
[2026-03-24 17:46:36,051][__main__][INFO] - Number of regex retries in iteration 0: 0
[2026-03-24 17:46:36,052][__main__][INFO] - agents played in iteration 0 are Bob, Alice_buffer, Alice, Bob_buffer
[2026-03-24 17:48:02,370][mllm.training.trainer_ad_align][INFO] - For task: Get advantages with critic gradient accumulation, ΔVRAM % (total): 0.00%, Current % of VRAM taken: 37.33%, Block Peak % of device VRAM: 18.62%, ΔTime: 00:01:25