Junyi42 committed
Commit b64f20d · verified · 1 Parent(s): 1a20940

Upload checkpoints_vlm_gym_jigsaw_one_image_lr2e_5_mse_only_ema9999_hashed/checkpoints_vlm_gym_jigsaw_one_image_lr2e_5_mse_only_ema9999_hashed

checkpoints_vlm_gym_jigsaw_one_image_lr2e_5_mse_only_ema9999_hashed/checkpoints_vlm_gym_jigsaw_one_image_lr2e_5_mse_only_ema9999_hashed/wandb/offline-run-20260111_235518-vlm_gym_jigsaw_one_img_lr2e_5_mse_only_ema9999_hashed-run0/files/config.yaml CHANGED
@@ -0,0 +1,181 @@
+wandb_version: 1
+
+_wandb:
+  desc: null
+  value:
+    python_version: 3.11.10
+    cli_version: 0.23.1
+    framework: huggingface
+    huggingface_version: 4.49.0
+    is_jupyter_run: false
+    is_kaggle_kernel: false
+    start_time: 1768175718
+    t:
+      1:
+      - 1
+      - 5
+      - 11
+      - 41
+      - 49
+      - 53
+      - 71
+      - 105
+      2:
+      - 1
+      - 5
+      - 11
+      - 41
+      - 49
+      - 53
+      - 71
+      - 105
+      3:
+      - 4
+      - 13
+      - 14
+      - 37
+      - 42
+      4: 3.11.10
+      5: 0.23.1
+      6: 4.49.0
+      13: linux-x86_64
+    e:
+      p78nkupquzvg4sp7iakb8i8e0ffje8id:
+        os: Linux-6.6.93+-x86_64-with-glibc2.35
+        python: CPython 3.11.10
+        started_at: '2026-01-11T23:55:18.222054Z'
+        args:
+        - --dataset_config_file
+        - ./data/configs/vlm_gym_jigsaw_train_mseloss_only.yaml
+        - --eval_dataset_config_file
+        - ./data/configs/vlm_gym_jigsaw_train_mseloss_only.yaml
+        - --viz_dataset_config_file
+        - ./data/configs/vlm_gym_jigsaw_train_mseloss_only.yaml
+        - --inference_hash_file
+        - /home/clouduser/Code/Github/launch_new/hashes_test_set_v10.json
+        - --train_data_dir
+        - /home/clouduser/Code/data/gym/jigsaw-swap_v5/train/
+        - --train_jsonl_path
+        - /home/clouduser/Code/data/gym/jigsaw-swap_v5/train/
+        - --eval_data_dir
+        - /home/clouduser/Code/data/gym/jigsaw-swap_v5/val/
+        - --eval_jsonl_path
+        - /home/clouduser/Code/data/gym/jigsaw-swap_v5/val/
+        - --model_path
+        - /home/clouduser/Code/Models/BAGEL-7B-MoT
+        - --layer_module
+        - Qwen2MoTDecoderLayer
+        - --max_latent_size
+        - '64'
+        - --resume-from
+        - /home/clouduser/Code/Models/BAGEL-7B-MoT
+        - --finetune_from_hf
+        - 'True'
+        - --auto_resume
+        - 'False'
+        - --resume-model-only
+        - 'True'
+        - --finetune-from-ema
+        - 'True'
+        - --log_every
+        - '1'
+        - --lr
+        - 2e-5
+        - --warmup_steps
+        - '300'
+        - --lr_scheduler
+        - cosine
+        - --num_worker
+        - '1'
+        - --expected_num_tokens
+        - '20000'
+        - --max_num_tokens
+        - '20000'
+        - --max_num_tokens_per_sample
+        - '20000'
+        - --visual_und
+        - 'True'
+        - --save_every
+        - '2500'
+        - --total_steps
+        - '5000'
+        - --text_cond_dropout_prob
+        - '0.0'
+        - --vae_cond_dropout_prob
+        - '0.0'
+        - --vit_cond_dropout_prob
+        - '0.0'
+        - --ema
+        - '0.9999'
+        - --checkpoint_dir
+        - /dev/shm/models/checkpoints_vlm_gym_jigsaw_one_image_lr2e_5_mse_only_ema9999_hashed
+        - --wandb_project
+        - bagel
+        - --wandb_name
+        - vlm_gym_jigsaw_one_img_lr2e_5_mse_only_ema9999_hashed
+        - --wandb_dir
+        - /dev/shm/models/checkpoints_vlm_gym_jigsaw_one_image_lr2e_5_mse_only_ema9999_hashed
+        - --wandb_offline
+        - 'True'
+        program: /home/clouduser/Code/Github/unified_world_model/train/pretrain_unified_navit.py
+        code_path: train/pretrain_unified_navit.py
+        code_path_local: train/pretrain_unified_navit.py
+        git:
+          remote_url: https://github.com/para-lost/unified_world_model
+          commit: b61f53a430c22fa4a17c69d2f903ce1fe266f48e
+        root: /dev/shm/models/checkpoints_vlm_gym_jigsaw_one_image_lr2e_5_mse_only_ema9999_hashed
+        host: junyizhang-launch-new-221746038-1-0
+        executable: /opt/conda/bin/python3.11
+        cpu_count: 48
+        cpu_count_logical: 96
+        gpu_type: NVIDIA A100-SXM4-80GB
+        gpu_count: 8
+        disk:
+          /:
+            total: '1052461830144'
+            used: '261625344000'
+        memory:
+          total: '1437332606976'
+        gpu_nvidia:
+        - name: NVIDIA A100-SXM4-80GB
+          memory_total: '85899345920'
+          cuda_cores: 6912
+          architecture: Ampere
+          uuid: GPU-5c8a9afd-07c9-5b67-4b7d-2d070676ab83
+        - name: NVIDIA A100-SXM4-80GB
+          memory_total: '85899345920'
+          cuda_cores: 6912
+          architecture: Ampere
+          uuid: GPU-f44fe102-70f0-4db8-5fd1-c30bfd0f9bb8
+        - name: NVIDIA A100-SXM4-80GB
+          memory_total: '85899345920'
+          cuda_cores: 6912
+          architecture: Ampere
+          uuid: GPU-97f9aee6-0f04-ce00-5732-7d6cab1f9170
+        - name: NVIDIA A100-SXM4-80GB
+          memory_total: '85899345920'
+          cuda_cores: 6912
+          architecture: Ampere
+          uuid: GPU-e645965f-7741-e8d5-63df-bf1e988d5549
+        - name: NVIDIA A100-SXM4-80GB
+          memory_total: '85899345920'
+          cuda_cores: 6912
+          architecture: Ampere
+          uuid: GPU-4123c973-b472-4c08-392e-9df433596e67
+        - name: NVIDIA A100-SXM4-80GB
+          memory_total: '85899345920'
+          cuda_cores: 6912
+          architecture: Ampere
+          uuid: GPU-8a672f68-5ec7-eb5e-8f00-748126c6ce99
+        - name: NVIDIA A100-SXM4-80GB
+          memory_total: '85899345920'
+          cuda_cores: 6912
+          architecture: Ampere
+          uuid: GPU-81b788fb-644b-899f-258e-b2ac8ffffb2b
+        - name: NVIDIA A100-SXM4-80GB
+          memory_total: '85899345920'
+          cuda_cores: 6912
+          architecture: Ampere
+          uuid: GPU-15e8faa7-0d4d-86aa-05db-cf5fb77e5111
+        cuda_version: '12.2'
+        writer_id: p78nkupquzvg4sp7iakb8i8e0ffje8id
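
The `args` list in this config stores the launch command as a flat sequence of flags and values. A minimal sketch of turning that sequence back into a mapping is shown here, in case it helps when comparing runs. The helper name `pair_cli_args` and the shortened `recorded` list are illustrative assumptions, not part of the training code, and the sketch assumes no value itself begins with `--`.

```python
from typing import Dict, List, Optional


def pair_cli_args(args: List[str]) -> Dict[str, Optional[str]]:
    """Pair each `--flag` with the token that follows it.

    A flag followed by another flag (or by nothing) maps to None.
    Hypothetical helper for illustration; values stay as strings.
    """
    parsed: Dict[str, Optional[str]] = {}
    i = 0
    while i < len(args):
        token = args[i]
        if not token.startswith("--"):
            raise ValueError(f"expected a flag, got {token!r}")
        name = token[2:]
        has_value = i + 1 < len(args) and not args[i + 1].startswith("--")
        parsed[name] = args[i + 1] if has_value else None
        i += 2 if has_value else 1
    return parsed


# A shortened copy of the recorded `args` list (values as logged above).
recorded = [
    "--lr", "2e-5",
    "--warmup_steps", "300",
    "--lr_scheduler", "cosine",
    "--save_every", "2500",
    "--total_steps", "5000",
    "--ema", "0.9999",
    "--wandb_offline", "True",
]

run_args = pair_cli_args(recorded)
learning_rate = float(run_args["lr"])        # convert explicitly; the log keeps strings
total_steps = int(run_args["total_steps"])
```

Values are left as strings on purpose, since the config records them that way and the training script's own argument parser decides their types.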