---
library_name: transformers
license: other
base_model: /data2/wuxinrui/RoboBrain2.0/HF_Models/BAAI-RoboBrain2.0-7B
tags:
- llama-factory
- full
- generated_from_trainer
model-index:
- name: output_1
  results: []
---

# output_1

This model is a fine-tuned version of `/data2/wuxinrui/RoboBrain2.0/HF_Models/BAAI-RoboBrain2.0-7B` (a local checkpoint of BAAI-RoboBrain2.0-7B) on the following datasets:

- COT_1_shorten_shorten2_budgetthinker_abalation_mllm
- COT_1_shorten_budgetthinker_abalation_mllm
- COT_1_budgetthinker_abalation_mllm
- COT_2_shorten_shorten2_budgetthinker_abalation_mllm
- COT_2_shorten_budgetthinker_abalation_mllm
- COT_2_budgetthinker_abalation_mllm
- COT_3_shorten_shorten2_budgetthinker_abalation_mllm
- COT_3_shorten_budgetthinker_abalation_mllm
- COT_3_budgetthinker_abalation_mllm
- COT_4_shorten_shorten2_budgetthinker_abalation_mllm
- COT_4_shorten_budgetthinker_abalation_mllm
- COT_4_budgetthinker_abalation_mllm
- COT_5_shorten_shorten2_budgetthinker_abalation_mllm
- COT_5_shorten_budgetthinker_abalation_mllm
- COT_5_budgetthinker_abalation_mllm
- COT_6_shorten_shorten2_budgetthinker_abalation_mllm
- COT_6_shorten_budgetthinker_abalation_mllm
- COT_6_budgetthinker_abalation_mllm
- COT_8_shorten_shorten2_budgetthinker_abalation_mllm
- COT_8_shorten_budgetthinker_abalation_mllm
- COT_8_budgetthinker_abalation_mllm
- COT_9_shorten_shorten2_budgetthinker_abalation_mllm
- COT_9_shorten_budgetthinker_abalation_mllm
- COT_9_budgetthinker_abalation_mllm
- COT_10_shorten_shorten2_budgetthinker_abalation_mllm
- COT_10_shorten_budgetthinker_abalation_mllm
- COT_10_budgetthinker_abalation_mllm

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-06
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 6
- gradient_accumulation_steps: 2
- total_train_batch_size: 24
- total_eval_batch_size: 48
- optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 2.0

### Training results

### Framework versions

- Transformers 4.51.3
- Pytorch 2.8.0+cu128
- Datasets 3.6.0
- Tokenizers 0.21.1
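
## How to use

A minimal, untested inference sketch. It assumes the fine-tuned checkpoint keeps the image-text-to-text interface of the BAAI-RoboBrain2.0-7B base model (a Qwen2.5-VL-style architecture) and loads through the standard `transformers` Auto classes; the checkpoint path, image URL, and prompt below are all placeholders.

```python
# Sketch only: path, image URL, and prompt are placeholders, and the Auto-class
# loading path is an assumption based on the Qwen2.5-VL-style base model.
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor

model_path = "output_1"  # hypothetical path to this fine-tuned checkpoint

processor = AutoProcessor.from_pretrained(model_path)
model = AutoModelForImageTextToText.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# One user turn with an image and a text question.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/scene.jpg"},  # placeholder image
            {"type": "text", "text": "Describe what the robot should do next."},
        ],
    }
]

# The processor's chat template tokenizes the text and fetches/encodes the image.
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```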
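
## Reproducing the training configuration

The hyperparameters above map directly onto standard `transformers` `TrainingArguments`. The sketch below is a hypothetical reconstruction (the run itself was launched with LLaMA-Factory, per the tags). Note how the effective sizes work out: 2 per device × 6 GPUs × 2 accumulation steps = 24 for training, and 8 per device × 6 GPUs = 48 for evaluation, matching the totals listed above. The `bf16` flag is an assumption and is not stated in the card.

```python
# Hypothetical reconstruction of the run's configuration as plain
# transformers TrainingArguments; the actual run used LLaMA-Factory.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="output_1",
    learning_rate=1e-6,
    per_device_train_batch_size=2,   # x 6 GPUs x 2 accumulation steps = 24 effective
    per_device_eval_batch_size=8,    # x 6 GPUs = 48 effective
    gradient_accumulation_steps=2,
    num_train_epochs=2.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    bf16=True,  # assumption: typical for 7B fine-tunes, not stated in the card
)
```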