booth-pic-api / backend_error.log
github-actions
Deploy to HF (clean history with LFS)
a06f06c
Loading weights: 0%| | 0/398 [00:00<?, ?it/s] Loading weights: 0%| | 1/398 [00:00<00:00, 15420.24it/s, Materializing param=logit_scale] Loading weights: 0%| | 1/398 [00:00<00:00, 4245.25it/s, Materializing param=logit_scale] Loading weights: 1%| | 2/398 [00:00<00:00, 4090.01it/s, Materializing param=text_model.embeddings.position_embedding.weight] Loading weights: 1%| | 2/398 [00:00<00:00, 3133.59it/s, Materializing param=text_model.embeddings.position_embedding.weight] Loading weights: 1%| | 3/398 [00:00<00:00, 3492.34it/s, Materializing param=text_model.embeddings.token_embedding.weight] Loading weights: 1%| | 3/398 [00:00<00:00, 3044.50it/s, Materializing param=text_model.embeddings.token_embedding.weight] Loading weights: 1%|1 | 4/398 [00:00<00:00, 3407.93it/s, Materializing param=text_model.encoder.layers.0.layer_norm1.bias] Loading weights: 1%|1 | 4/398 [00:00<00:00, 3049.29it/s, Materializing param=text_model.encoder.layers.0.layer_norm1.bias] Loading weights: 1%|1 | 5/398 [00:00<00:00, 3274.75it/s, Materializing param=text_model.encoder.layers.0.layer_norm1.weight] Loading weights: 1%|1 | 5/398 [00:00<00:00, 2919.20it/s, Materializing param=text_model.encoder.layers.0.layer_norm1.weight] Loading weights: 2%|1 | 6/398 [00:00<00:00, 3128.91it/s, Materializing param=text_model.encoder.layers.0.layer_norm2.bias] Loading weights: 2%|1 | 6/398 [00:00<00:00, 2933.76it/s, Materializing param=text_model.encoder.layers.0.layer_norm2.bias] Loading weights: 2%|1 | 7/398 [00:00<00:00, 3167.90it/s, Materializing param=text_model.encoder.layers.0.layer_norm2.weight] Loading weights: 2%|1 | 7/398 [00:00<00:00, 3009.75it/s, Materializing param=text_model.encoder.layers.0.layer_norm2.weight] Loading weights: 2%|2 | 8/398 [00:00<00:00, 3096.28it/s, Materializing param=text_model.encoder.layers.0.mlp.fc1.bias] Loading weights: 2%|2 | 8/398 [00:00<00:00, 2838.30it/s, Materializing param=text_model.encoder.layers.0.mlp.fc1.bias] Loading weights: 2%|2 | 9/398 [00:00<00:00, 
2929.44it/s, Materializing param=text_model.encoder.layers.0.mlp.fc1.weight] Loading weights: 2%|2 | 9/398 [00:00<00:00, 2814.97it/s, Materializing param=text_model.encoder.layers.0.mlp.fc1.weight] Loading weights: 3%|2 | 10/398 [00:00<00:00, 2976.79it/s, Materializing param=text_model.encoder.layers.0.mlp.fc2.bias] Loading weights: 3%|2 | 10/398 [00:00<00:00, 2870.45it/s, Materializing param=text_model.encoder.layers.0.mlp.fc2.bias] Loading weights: 3%|2 | 11/398 [00:00<00:00, 3015.32it/s, Materializing param=text_model.encoder.layers.0.mlp.fc2.weight] Loading weights: 3%|2 | 11/398 [00:00<00:00, 2919.90it/s, Materializing param=text_model.encoder.layers.0.mlp.fc2.weight] Loading weights: 3%|3 | 12/398 [00:00<00:00, 3059.49it/s, Materializing param=text_model.encoder.layers.0.self_attn.k_proj.bias] Loading weights: 3%|3 | 12/398 [00:00<00:00, 2975.91it/s, Materializing param=text_model.encoder.layers.0.self_attn.k_proj.bias] Loading weights: 3%|3 | 13/398 [00:00<00:00, 3015.65it/s, Materializing param=text_model.encoder.layers.0.self_attn.k_proj.weight] Loading weights: 3%|3 | 13/398 [00:00<00:00, 2858.50it/s, Materializing param=text_model.encoder.layers.0.self_attn.k_proj.weight] Loading weights: 4%|3 | 14/398 [00:00<00:00, 2855.63it/s, Materializing param=text_model.encoder.layers.0.self_attn.out_proj.bias] Loading weights: 4%|3 | 14/398 [00:00<00:00, 2713.13it/s, Materializing param=text_model.encoder.layers.0.self_attn.out_proj.bias] Loading weights: 4%|3 | 15/398 [00:00<00:00, 2702.05it/s, Materializing param=text_model.encoder.layers.0.self_attn.out_proj.weight] Loading weights: 4%|3 | 15/398 [00:00<00:00, 2620.24it/s, Materializing param=text_model.encoder.layers.0.self_attn.out_proj.weight] Loading weights: 4%|4 | 16/398 [00:00<00:00, 2687.36it/s, Materializing param=text_model.encoder.layers.0.self_attn.q_proj.bias] Loading weights: 4%|4 | 16/398 [00:00<00:00, 2505.84it/s, Materializing param=text_model.encoder.layers.0.self_attn.q_proj.bias] Loading 
weights: 4%|4 | 17/398 [00:00<00:00, 2536.13it/s, Materializing param=text_model.encoder.layers.0.self_attn.q_proj.weight] Loading weights: 4%|4 | 17/398 [00:00<00:00, 2449.52it/s, Materializing param=text_model.encoder.layers.0.self_attn.q_proj.weight] Loading weights: 5%|4 | 18/398 [00:00<00:00, 2486.09it/s, Materializing param=text_model.encoder.layers.0.self_attn.v_proj.bias] Loading weights: 5%|4 | 18/398 [00:00<00:00, 2409.75it/s, Materializing param=text_model.encoder.layers.0.self_attn.v_proj.bias] Loading weights: 5%|4 | 19/398 [00:00<00:00, 2461.67it/s, Materializing param=text_model.encoder.layers.0.self_attn.v_proj.weight] Loading weights: 5%|4 | 19/398 [00:00<00:00, 2406.52it/s, Materializing param=text_model.encoder.layers.0.self_attn.v_proj.weight] Loading weights: 5%|5 | 20/398 [00:00<00:00, 2466.29it/s, Materializing param=text_model.encoder.layers.1.layer_norm1.bias] Loading weights: 5%|5 | 20/398 [00:00<00:00, 2405.61it/s, Materializing param=text_model.encoder.layers.1.layer_norm1.bias] Loading weights: 5%|5 | 21/398 [00:00<00:00, 2458.90it/s, Materializing param=text_model.encoder.layers.1.layer_norm1.weight] Loading weights: 5%|5 | 21/398 [00:00<00:00, 2413.76it/s, Materializing param=text_model.encoder.layers.1.layer_norm1.weight] Loading weights: 6%|5 | 22/398 [00:00<00:00, 2461.18it/s, Materializing param=text_model.encoder.layers.1.layer_norm2.bias] Loading weights: 6%|5 | 22/398 [00:00<00:00, 2398.55it/s, Materializing param=text_model.encoder.layers.1.layer_norm2.bias] Loading weights: 6%|5 | 23/398 [00:00<00:00, 2448.95it/s, Materializing param=text_model.encoder.layers.1.layer_norm2.weight] Loading weights: 6%|5 | 23/398 [00:00<00:00, 2399.49it/s, Materializing param=text_model.encoder.layers.1.layer_norm2.weight] Loading weights: 6%|6 | 24/398 [00:00<00:00, 2454.42it/s, Materializing param=text_model.encoder.layers.1.mlp.fc1.bias] Loading weights: 6%|6 | 24/398 [00:00<00:00, 2319.11it/s, Materializing 
param=text_model.encoder.layers.1.mlp.fc1.bias] Loading weights: 6%|6 | 25/398 [00:00<00:00, 2224.86it/s, Materializing param=text_model.encoder.layers.1.mlp.fc1.weight] Loading weights: 6%|6 | 25/398 [00:00<00:00, 2122.41it/s, Materializing param=text_model.encoder.layers.1.mlp.fc1.weight] Loading weights: 7%|6 | 26/398 [00:00<00:00, 2125.27it/s, Materializing param=text_model.encoder.layers.1.mlp.fc2.bias] Loading weights: 7%|6 | 26/398 [00:00<00:00, 2094.90it/s, Materializing param=text_model.encoder.layers.1.mlp.fc2.bias] Loading weights: 7%|6 | 27/398 [00:00<00:00, 2121.07it/s, Materializing param=text_model.encoder.layers.1.mlp.fc2.weight] Loading weights: 7%|6 | 27/398 [00:00<00:00, 2087.10it/s, Materializing param=text_model.encoder.layers.1.mlp.fc2.weight] Loading weights: 7%|7 | 28/398 [00:00<00:00, 2126.08it/s, Materializing param=text_model.encoder.layers.1.self_attn.k_proj.bias] Loading weights: 7%|7 | 28/398 [00:00<00:00, 2097.71it/s, Materializing param=text_model.encoder.layers.1.self_attn.k_proj.bias] Loading weights: 7%|7 | 29/398 [00:00<00:00, 2141.80it/s, Materializing param=text_model.encoder.layers.1.self_attn.k_proj.weight] Loading weights: 7%|7 | 29/398 [00:00<00:00, 2119.85it/s, Materializing param=text_model.encoder.layers.1.self_attn.k_proj.weight] Loading weights: 8%|7 | 30/398 [00:00<00:00, 2154.79it/s, Materializing param=text_model.encoder.layers.1.self_attn.out_proj.bias] Loading weights: 8%|7 | 30/398 [00:00<00:00, 2106.88it/s, Materializing param=text_model.encoder.layers.1.self_attn.out_proj.bias] Loading weights: 8%|7 | 31/398 [00:00<00:00, 2149.18it/s, Materializing param=text_model.encoder.layers.1.self_attn.out_proj.weight] Loading weights: 8%|7 | 31/398 [00:00<00:00, 2129.79it/s, Materializing param=text_model.encoder.layers.1.self_attn.out_proj.weight] Loading weights: 8%|8 | 32/398 [00:00<00:00, 2173.50it/s, Materializing param=text_model.encoder.layers.1.self_attn.q_proj.bias] Loading weights: 8%|8 | 32/398 [00:00<00:00, 
2156.87it/s, Materializing param=text_model.encoder.layers.1.self_attn.q_proj.bias] Loading weights: 8%|8 | 33/398 [00:00<00:00, 2201.24it/s, Materializing param=text_model.encoder.layers.1.self_attn.q_proj.weight] Loading weights: 8%|8 | 33/398 [00:00<00:00, 2184.33it/s, Materializing param=text_model.encoder.layers.1.self_attn.q_proj.weight] Loading weights: 9%|8 | 34/398 [00:00<00:00, 2227.63it/s, Materializing param=text_model.encoder.layers.1.self_attn.v_proj.bias] Loading weights: 9%|8 | 34/398 [00:00<00:00, 2211.33it/s, Materializing param=text_model.encoder.layers.1.self_attn.v_proj.bias] Loading weights: 9%|8 | 35/398 [00:00<00:00, 2251.85it/s, Materializing param=text_model.encoder.layers.1.self_attn.v_proj.weight] Loading weights: 9%|8 | 35/398 [00:00<00:00, 2224.93it/s, Materializing param=text_model.encoder.layers.1.self_attn.v_proj.weight] Loading weights: 9%|9 | 36/398 [00:00<00:00, 2253.62it/s, Materializing param=text_model.encoder.layers.2.layer_norm1.bias] Loading weights: 9%|9 | 36/398 [00:00<00:00, 2227.66it/s, Materializing param=text_model.encoder.layers.2.layer_norm1.bias] Loading weights: 9%|9 | 37/398 [00:00<00:00, 2262.66it/s, Materializing param=text_model.encoder.layers.2.layer_norm1.weight] Loading weights: 9%|9 | 37/398 [00:00<00:00, 2234.26it/s, Materializing param=text_model.encoder.layers.2.layer_norm1.weight] Loading weights: 10%|9 | 38/398 [00:00<00:00, 2260.34it/s, Materializing param=text_model.encoder.layers.2.layer_norm2.bias] Loading weights: 10%|9 | 38/398 [00:00<00:00, 2233.98it/s, Materializing param=text_model.encoder.layers.2.layer_norm2.bias] Loading weights: 10%|9 | 39/398 [00:00<00:00, 2254.26it/s, Materializing param=text_model.encoder.layers.2.layer_norm2.weight] Loading weights: 10%|9 | 39/398 [00:00<00:00, 2234.27it/s, Materializing param=text_model.encoder.layers.2.layer_norm2.weight] Loading weights: 10%|# | 40/398 [00:00<00:00, 2267.19it/s, Materializing param=text_model.encoder.layers.2.mlp.fc1.bias] Loading 
weights: 10%|# | 40/398 [00:00<00:00, 2251.04it/s, Materializing param=text_model.encoder.layers.2.mlp.fc1.bias] Loading weights: 10%|# | 41/398 [00:00<00:00, 2285.42it/s, Materializing param=text_model.encoder.layers.2.mlp.fc1.weight] Loading weights: 10%|# | 41/398 [00:00<00:00, 2264.12it/s, Materializing param=text_model.encoder.layers.2.mlp.fc1.weight] Loading weights: 11%|# | 42/398 [00:00<00:00, 2291.55it/s, Materializing param=text_model.encoder.layers.2.mlp.fc2.bias] Loading weights: 11%|# | 42/398 [00:00<00:00, 2273.54it/s, Materializing param=text_model.encoder.layers.2.mlp.fc2.bias] Loading weights: 11%|# | 43/398 [00:00<00:00, 2297.90it/s, Materializing param=text_model.encoder.layers.2.mlp.fc2.weight] Loading weights: 11%|# | 43/398 [00:00<00:00, 2278.30it/s, Materializing param=text_model.encoder.layers.2.mlp.fc2.weight] Loading weights: 11%|#1 | 44/398 [00:00<00:00, 2308.48it/s, Materializing param=text_model.encoder.layers.2.self_attn.k_proj.bias] Loading weights: 11%|#1 | 44/398 [00:00<00:00, 2287.71it/s, Materializing param=text_model.encoder.layers.2.self_attn.k_proj.bias] Loading weights: 11%|#1 | 45/398 [00:00<00:00, 2300.27it/s, Materializing param=text_model.encoder.layers.2.self_attn.k_proj.weight] Loading weights: 11%|#1 | 45/398 [00:00<00:00, 2276.24it/s, Materializing param=text_model.encoder.layers.2.self_attn.k_proj.weight] Loading weights: 12%|#1 | 46/398 [00:00<00:00, 2295.90it/s, Materializing param=text_model.encoder.layers.2.self_attn.out_proj.bias] Loading weights: 12%|#1 | 46/398 [00:00<00:00, 2272.32it/s, Materializing param=text_model.encoder.layers.2.self_attn.out_proj.bias] Loading weights: 12%|#1 | 47/398 [00:00<00:00, 2295.84it/s, Materializing param=text_model.encoder.layers.2.self_attn.out_proj.weight] Loading weights: 12%|#1 | 47/398 [00:00<00:00, 2279.22it/s, Materializing param=text_model.encoder.layers.2.self_attn.out_proj.weight] Loading weights: 12%|#2 | 48/398 [00:00<00:00, 2307.47it/s, Materializing 
param=text_model.encoder.layers.2.self_attn.q_proj.bias] Loading weights: 12%|#2 | 48/398 [00:00<00:00, 2288.19it/s, Materializing param=text_model.encoder.layers.2.self_attn.q_proj.bias] Loading weights: 12%|#2 | 49/398 [00:00<00:00, 2308.60it/s, Materializing param=text_model.encoder.layers.2.self_attn.q_proj.weight] Loading weights: 12%|#2 | 49/398 [00:00<00:00, 2291.23it/s, Materializing param=text_model.encoder.layers.2.self_attn.q_proj.weight] Loading weights: 13%|#2 | 50/398 [00:00<00:00, 2318.58it/s, Materializing param=text_model.encoder.layers.2.self_attn.v_proj.bias] Loading weights: 13%|#2 | 50/398 [00:00<00:00, 2302.41it/s, Materializing param=text_model.encoder.layers.2.self_attn.v_proj.bias] Loading weights: 13%|#2 | 51/398 [00:00<00:00, 2323.29it/s, Materializing param=text_model.encoder.layers.2.self_attn.v_proj.weight] Loading weights: 13%|#2 | 51/398 [00:00<00:00, 2305.48it/s, Materializing param=text_model.encoder.layers.2.self_attn.v_proj.weight] Loading weights: 13%|#3 | 52/398 [00:00<00:00, 2332.34it/s, Materializing param=text_model.encoder.layers.3.layer_norm1.bias] Loading weights: 13%|#3 | 52/398 [00:00<00:00, 2320.62it/s, Materializing param=text_model.encoder.layers.3.layer_norm1.bias] Loading weights: 13%|#3 | 53/398 [00:00<00:00, 2349.03it/s, Materializing param=text_model.encoder.layers.3.layer_norm1.weight] Loading weights: 13%|#3 | 53/398 [00:00<00:00, 2337.45it/s, Materializing param=text_model.encoder.layers.3.layer_norm1.weight] Loading weights: 14%|#3 | 54/398 [00:00<00:00, 2366.07it/s, Materializing param=text_model.encoder.layers.3.layer_norm2.bias] Loading weights: 14%|#3 | 54/398 [00:00<00:00, 2348.68it/s, Materializing param=text_model.encoder.layers.3.layer_norm2.bias] Loading weights: 14%|#3 | 55/398 [00:00<00:00, 2363.52it/s, Materializing param=text_model.encoder.layers.3.layer_norm2.weight] Loading weights: 14%|#3 | 55/398 [00:00<00:00, 2346.31it/s, Materializing param=text_model.encoder.layers.3.layer_norm2.weight] 
Loading weights: 14%|#4 | 56/398 [00:00<00:00, 2370.50it/s, Materializing param=text_model.encoder.layers.3.mlp.fc1.bias] Loading weights: 14%|#4 | 56/398 [00:00<00:00, 2358.62it/s, Materializing param=text_model.encoder.layers.3.mlp.fc1.bias] Loading weights: 14%|#4 | 57/398 [00:00<00:00, 2382.75it/s, Materializing param=text_model.encoder.layers.3.mlp.fc1.weight] Loading weights: 14%|#4 | 57/398 [00:00<00:00, 2371.61it/s, Materializing param=text_model.encoder.layers.3.mlp.fc1.weight] Loading weights: 15%|#4 | 58/398 [00:00<00:00, 2394.65it/s, Materializing param=text_model.encoder.layers.3.mlp.fc2.bias] Loading weights: 15%|#4 | 58/398 [00:00<00:00, 2375.82it/s, Materializing param=text_model.encoder.layers.3.mlp.fc2.bias] Loading weights: 15%|#4 | 59/398 [00:00<00:00, 2389.25it/s, Materializing param=text_model.encoder.layers.3.mlp.fc2.weight] Loading weights: 15%|#4 | 59/398 [00:00<00:00, 2372.14it/s, Materializing param=text_model.encoder.layers.3.mlp.fc2.weight] Loading weights: 15%|#5 | 60/398 [00:00<00:00, 2381.44it/s, Materializing param=text_model.encoder.layers.3.self_attn.k_proj.bias] Loading weights: 15%|#5 | 60/398 [00:00<00:00, 2362.61it/s, Materializing param=text_model.encoder.layers.3.self_attn.k_proj.bias] Loading weights: 15%|#5 | 61/398 [00:00<00:00, 2371.88it/s, Materializing param=text_model.encoder.layers.3.self_attn.k_proj.weight] Loading weights: 15%|#5 | 61/398 [00:00<00:00, 2356.96it/s, Materializing param=text_model.encoder.layers.3.self_attn.k_proj.weight] Loading weights: 16%|#5 | 62/398 [00:00<00:00, 2363.55it/s, Materializing param=text_model.encoder.layers.3.self_attn.out_proj.bias] Loading weights: 16%|#5 | 62/398 [00:00<00:00, 2342.41it/s, Materializing param=text_model.encoder.layers.3.self_attn.out_proj.bias] Loading weights: 16%|#5 | 63/398 [00:00<00:00, 2358.83it/s, Materializing param=text_model.encoder.layers.3.self_attn.out_proj.weight] Loading weights: 16%|#5 | 63/398 [00:00<00:00, 2323.59it/s, Materializing 
param=text_model.encoder.layers.3.self_attn.out_proj.weight] Loading weights: 16%|#6 | 64/398 [00:00<00:00, 2321.22it/s, Materializing param=text_model.encoder.layers.3.self_attn.q_proj.bias] Loading weights: 16%|#6 | 64/398 [00:00<00:00, 2302.47it/s, Materializing param=text_model.encoder.layers.3.self_attn.q_proj.bias] Loading weights: 16%|#6 | 65/398 [00:00<00:00, 2309.36it/s, Materializing param=text_model.encoder.layers.3.self_attn.q_proj.weight] Loading weights: 16%|#6 | 65/398 [00:00<00:00, 2291.04it/s, Materializing param=text_model.encoder.layers.3.self_attn.q_proj.weight] Loading weights: 17%|#6 | 66/398 [00:00<00:00, 2302.26it/s, Materializing param=text_model.encoder.layers.3.self_attn.v_proj.bias] Loading weights: 17%|#6 | 66/398 [00:00<00:00, 2284.40it/s, Materializing param=text_model.encoder.layers.3.self_attn.v_proj.bias] Loading weights: 17%|#6 | 67/398 [00:00<00:00, 2299.41it/s, Materializing param=text_model.encoder.layers.3.self_attn.v_proj.weight] Loading weights: 17%|#6 | 67/398 [00:00<00:00, 2285.61it/s, Materializing param=text_model.encoder.layers.3.self_attn.v_proj.weight] Loading weights: 17%|#7 | 68/398 [00:00<00:00, 2300.97it/s, Materializing param=text_model.encoder.layers.4.layer_norm1.bias] Loading weights: 17%|#7 | 68/398 [00:00<00:00, 2287.83it/s, Materializing param=text_model.encoder.layers.4.layer_norm1.bias] Loading weights: 17%|#7 | 69/398 [00:00<00:00, 2304.21it/s, Materializing param=text_model.encoder.layers.4.layer_norm1.weight] Loading weights: 17%|#7 | 69/398 [00:00<00:00, 2291.28it/s, Materializing param=text_model.encoder.layers.4.layer_norm1.weight] Loading weights: 18%|#7 | 70/398 [00:00<00:00, 2306.74it/s, Materializing param=text_model.encoder.layers.4.layer_norm2.bias] Loading weights: 18%|#7 | 70/398 [00:00<00:00, 2293.38it/s, Materializing param=text_model.encoder.layers.4.layer_norm2.bias] Loading weights: 18%|#7 | 71/398 [00:00<00:00, 2308.96it/s, Materializing 
param=text_model.encoder.layers.4.layer_norm2.weight] Loading weights: 18%|#7 | 71/398 [00:00<00:00, 2297.38it/s, Materializing param=text_model.encoder.layers.4.layer_norm2.weight] Loading weights: 18%|#8 | 72/398 [00:00<00:00, 2315.94it/s, Materializing param=text_model.encoder.layers.4.mlp.fc1.bias] Loading weights: 18%|#8 | 72/398 [00:00<00:00, 2304.53it/s, Materializing param=text_model.encoder.layers.4.mlp.fc1.bias] Loading weights: 18%|#8 | 73/398 [00:00<00:00, 2322.53it/s, Materializing param=text_model.encoder.layers.4.mlp.fc1.weight] Loading weights: 18%|#8 | 73/398 [00:00<00:00, 2312.99it/s, Materializing param=text_model.encoder.layers.4.mlp.fc1.weight] Loading weights: 19%|#8 | 74/398 [00:00<00:00, 2330.38it/s, Materializing param=text_model.encoder.layers.4.mlp.fc2.bias] Loading weights: 19%|#8 | 74/398 [00:00<00:00, 2321.16it/s, Materializing param=text_model.encoder.layers.4.mlp.fc2.bias] Loading weights: 19%|#8 | 75/398 [00:00<00:00, 2335.27it/s, Materializing param=text_model.encoder.layers.4.mlp.fc2.weight] Loading weights: 19%|#8 | 75/398 [00:00<00:00, 2323.53it/s, Materializing param=text_model.encoder.layers.4.mlp.fc2.weight] Loading weights: 19%|#9 | 76/398 [00:00<00:00, 2338.34it/s, Materializing param=text_model.encoder.layers.4.self_attn.k_proj.bias] Loading weights: 19%|#9 | 76/398 [00:00<00:00, 2325.34it/s, Materializing param=text_model.encoder.layers.4.self_attn.k_proj.bias] Loading weights: 19%|#9 | 77/398 [00:00<00:00, 2340.55it/s, Materializing param=text_model.encoder.layers.4.self_attn.k_proj.weight] Loading weights: 19%|#9 | 77/398 [00:00<00:00, 2331.41it/s, Materializing param=text_model.encoder.layers.4.self_attn.k_proj.weight] Loading weights: 20%|#9 | 78/398 [00:00<00:00, 2346.56it/s, Materializing param=text_model.encoder.layers.4.self_attn.out_proj.bias] Loading weights: 20%|#9 | 78/398 [00:00<00:00, 2333.73it/s, Materializing param=text_model.encoder.layers.4.self_attn.out_proj.bias] Loading weights: 20%|#9 | 79/398 
[00:00<00:00, 2347.09it/s, Materializing param=text_model.encoder.layers.4.self_attn.out_proj.weight] Loading weights: 20%|#9 | 79/398 [00:00<00:00, 2337.04it/s, Materializing param=text_model.encoder.layers.4.self_attn.out_proj.weight] Loading weights: 20%|## | 80/398 [00:00<00:00, 2353.64it/s, Materializing param=text_model.encoder.layers.4.self_attn.q_proj.bias] Loading weights: 20%|## | 80/398 [00:00<00:00, 2345.33it/s, Materializing param=text_model.encoder.layers.4.self_attn.q_proj.bias] Loading weights: 20%|## | 81/398 [00:00<00:00, 2363.55it/s, Materializing param=text_model.encoder.layers.4.self_attn.q_proj.weight] Loading weights: 20%|## | 81/398 [00:00<00:00, 2350.51it/s, Materializing param=text_model.encoder.layers.4.self_attn.q_proj.weight] Loading weights: 21%|## | 82/398 [00:00<00:00, 2362.65it/s, Materializing param=text_model.encoder.layers.4.self_attn.v_proj.bias] Loading weights: 21%|## | 82/398 [00:00<00:00, 2352.02it/s, Materializing param=text_model.encoder.layers.4.self_attn.v_proj.bias] Loading weights: 21%|## | 83/398 [00:00<00:00, 2367.15it/s, Materializing param=text_model.encoder.layers.4.self_attn.v_proj.weight] Loading weights: 21%|## | 83/398 [00:00<00:00, 2357.83it/s, Materializing param=text_model.encoder.layers.4.self_attn.v_proj.weight] Loading weights: 21%|##1 | 84/398 [00:00<00:00, 2374.44it/s, Materializing param=text_model.encoder.layers.5.layer_norm1.bias] Loading weights: 21%|##1 | 84/398 [00:00<00:00, 2365.81it/s, Materializing param=text_model.encoder.layers.5.layer_norm1.bias] Loading weights: 21%|##1 | 85/398 [00:00<00:00, 2383.14it/s, Materializing param=text_model.encoder.layers.5.layer_norm1.weight] Loading weights: 21%|##1 | 85/398 [00:00<00:00, 2375.47it/s, Materializing param=text_model.encoder.layers.5.layer_norm1.weight] Loading weights: 22%|##1 | 86/398 [00:00<00:00, 2393.17it/s, Materializing param=text_model.encoder.layers.5.layer_norm2.bias] Loading weights: 22%|##1 | 86/398 [00:00<00:00, 2385.84it/s, 
Materializing param=text_model.encoder.layers.5.layer_norm2.bias] Loading weights: 22%|##1 | 87/398 [00:00<00:00, 2400.25it/s, Materializing param=text_model.encoder.layers.5.layer_norm2.weight] Loading weights: 22%|##1 | 87/398 [00:00<00:00, 2389.10it/s, Materializing param=text_model.encoder.layers.5.layer_norm2.weight] Loading weights: 22%|##2 | 88/398 [00:00<00:00, 2398.18it/s, Materializing param=text_model.encoder.layers.5.mlp.fc1.bias] Loading weights: 22%|##2 | 88/398 [00:00<00:00, 2384.61it/s, Materializing param=text_model.encoder.layers.5.mlp.fc1.bias] Loading weights: 22%|##2 | 89/398 [00:00<00:00, 2393.27it/s, Materializing param=text_model.encoder.layers.5.mlp.fc1.weight] Loading weights: 22%|##2 | 89/398 [00:00<00:00, 2383.31it/s, Materializing param=text_model.encoder.layers.5.mlp.fc1.weight] Loading weights: 23%|##2 | 90/398 [00:00<00:00, 2398.60it/s, Materializing param=text_model.encoder.layers.5.mlp.fc2.bias] Loading weights: 23%|##2 | 90/398 [00:00<00:00, 2389.75it/s, Materializing param=text_model.encoder.layers.5.mlp.fc2.bias] Loading weights: 23%|##2 | 91/398 [00:00<00:00, 2401.59it/s, Materializing param=text_model.encoder.layers.5.mlp.fc2.weight] Loading weights: 23%|##2 | 91/398 [00:00<00:00, 2388.59it/s, Materializing param=text_model.encoder.layers.5.mlp.fc2.weight] Loading weights: 23%|##3 | 92/398 [00:00<00:00, 2399.25it/s, Materializing param=text_model.encoder.layers.5.self_attn.k_proj.bias] Loading weights: 23%|##3 | 92/398 [00:00<00:00, 2389.40it/s, Materializing param=text_model.encoder.layers.5.self_attn.k_proj.bias] Loading weights: 23%|##3 | 93/398 [00:00<00:00, 2403.55it/s, Materializing param=text_model.encoder.layers.5.self_attn.k_proj.weight] Loading weights: 23%|##3 | 93/398 [00:00<00:00, 2395.51it/s, Materializing param=text_model.encoder.layers.5.self_attn.k_proj.weight] Loading weights: 24%|##3 | 94/398 [00:00<00:00, 2406.11it/s, Materializing param=text_model.encoder.layers.5.self_attn.out_proj.bias] Loading weights: 
24%|##3 | 94/398 [00:00<00:00, 2396.29it/s, Materializing param=text_model.encoder.layers.5.self_attn.out_proj.bias] Loading weights: 24%|##3 | 95/398 [00:00<00:00, 2409.91it/s, Materializing param=text_model.encoder.layers.5.self_attn.out_proj.weight] Loading weights: 24%|##3 | 95/398 [00:00<00:00, 2402.06it/s, Materializing param=text_model.encoder.layers.5.self_attn.out_proj.weight] Loading weights: 24%|##4 | 96/398 [00:00<00:00, 2416.99it/s, Materializing param=text_model.encoder.layers.5.self_attn.q_proj.bias] Loading weights: 24%|##4 | 96/398 [00:00<00:00, 2408.13it/s, Materializing param=text_model.encoder.layers.5.self_attn.q_proj.bias] Loading weights: 24%|##4 | 97/398 [00:00<00:00, 2422.24it/s, Materializing param=text_model.encoder.layers.5.self_attn.q_proj.weight] Loading weights: 24%|##4 | 97/398 [00:00<00:00, 2414.94it/s, Materializing param=text_model.encoder.layers.5.self_attn.q_proj.weight] Loading weights: 25%|##4 | 98/398 [00:00<00:00, 2430.22it/s, Materializing param=text_model.encoder.layers.5.self_attn.v_proj.bias] Loading weights: 25%|##4 | 98/398 [00:00<00:00, 2422.92it/s, Materializing param=text_model.encoder.layers.5.self_attn.v_proj.bias] Loading weights: 25%|##4 | 99/398 [00:00<00:00, 2438.26it/s, Materializing param=text_model.encoder.layers.5.self_attn.v_proj.weight] Loading weights: 25%|##4 | 99/398 [00:00<00:00, 2431.07it/s, Materializing param=text_model.encoder.layers.5.self_attn.v_proj.weight] Loading weights: 25%|##5 | 100/398 [00:00<00:00, 2446.13it/s, Materializing param=text_model.encoder.layers.6.layer_norm1.bias] Loading weights: 25%|##5 | 100/398 [00:00<00:00, 2439.30it/s, Materializing param=text_model.encoder.layers.6.layer_norm1.bias] Loading weights: 25%|##5 | 101/398 [00:00<00:00, 2451.66it/s, Materializing param=text_model.encoder.layers.6.layer_norm1.weight] Loading weights: 25%|##5 | 101/398 [00:00<00:00, 2441.87it/s, Materializing param=text_model.encoder.layers.6.layer_norm1.weight] Loading weights: 26%|##5 | 
102/398 [00:00<00:00, 2452.06it/s, Materializing param=text_model.encoder.layers.6.layer_norm2.bias] Loading weights: 26%|##5 | 102/398 [00:00<00:00, 2442.66it/s, Materializing param=text_model.encoder.layers.6.layer_norm2.bias] Loading weights: 26%|##5 | 103/398 [00:00<00:00, 2451.92it/s, Materializing param=text_model.encoder.layers.6.layer_norm2.weight] Loading weights: 26%|##5 | 103/398 [00:00<00:00, 2442.42it/s, Materializing param=text_model.encoder.layers.6.layer_norm2.weight] Loading weights: 26%|##6 | 104/398 [00:00<00:00, 2452.06it/s, Materializing param=text_model.encoder.layers.6.mlp.fc1.bias] Loading weights: 26%|##6 | 104/398 [00:00<00:00, 2443.19it/s, Materializing param=text_model.encoder.layers.6.mlp.fc1.bias] Loading weights: 26%|##6 | 105/398 [00:00<00:00, 2455.11it/s, Materializing param=text_model.encoder.layers.6.mlp.fc1.weight] Loading weights: 26%|##6 | 105/398 [00:00<00:00, 2448.17it/s, Materializing param=text_model.encoder.layers.6.mlp.fc1.weight] Loading weights: 27%|##6 | 106/398 [00:00<00:00, 2461.34it/s, Materializing param=text_model.encoder.layers.6.mlp.fc2.bias] Loading weights: 27%|##6 | 106/398 [00:00<00:00, 2454.37it/s, Materializing param=text_model.encoder.layers.6.mlp.fc2.bias] Loading weights: 27%|##6 | 107/398 [00:00<00:00, 2465.23it/s, Materializing param=text_model.encoder.layers.6.mlp.fc2.weight] Loading weights: 27%|##6 | 107/398 [00:00<00:00, 2456.19it/s, Materializing param=text_model.encoder.layers.6.mlp.fc2.weight] Loading weights: 27%|##7 | 108/398 [00:00<00:00, 2468.08it/s, Materializing param=text_model.encoder.layers.6.self_attn.k_proj.bias] Loading weights: 27%|##7 | 108/398 [00:00<00:00, 2455.51it/s, Materializing param=text_model.encoder.layers.6.self_attn.k_proj.bias] Loading weights: 27%|##7 | 109/398 [00:00<00:00, 2462.36it/s, Materializing param=text_model.encoder.layers.6.self_attn.k_proj.weight] Loading weights: 27%|##7 | 109/398 [00:00<00:00, 2451.53it/s, Materializing 
param=text_model.encoder.layers.6.self_attn.k_proj.weight] Loading weights: 28%|##7 | 110/398 [00:00<00:00, 2456.53it/s, Materializing param=text_model.encoder.layers.6.self_attn.out_proj.bias] Loading weights: 28%|##7 | 110/398 [00:00<00:00, 2442.28it/s, Materializing param=text_model.encoder.layers.6.self_attn.out_proj.bias] Loading weights: 28%|##7 | 111/398 [00:00<00:00, 2447.97it/s, Materializing param=text_model.encoder.layers.6.self_attn.out_proj.weight] Loading weights: 28%|##7 | 111/398 [00:00<00:00, 2440.25it/s, Materializing param=text_model.encoder.layers.6.self_attn.out_proj.weight] Loading weights: 28%|##8 | 112/398 [00:00<00:00, 2452.49it/s, Materializing param=text_model.encoder.layers.6.self_attn.q_proj.bias] Loading weights: 28%|##8 | 112/398 [00:00<00:00, 2441.82it/s, Materializing param=text_model.encoder.layers.6.self_attn.q_proj.bias] Loading weights: 28%|##8 | 113/398 [00:00<00:00, 2449.80it/s, Materializing param=text_model.encoder.layers.6.self_attn.q_proj.weight] Loading weights: 28%|##8 | 113/398 [00:00<00:00, 2438.06it/s, Materializing param=text_model.encoder.layers.6.self_attn.q_proj.weight] Loading weights: 29%|##8 | 114/398 [00:00<00:00, 2443.48it/s, Materializing param=text_model.encoder.layers.6.self_attn.v_proj.bias] Loading weights: 29%|##8 | 114/398 [00:00<00:00, 2430.78it/s, Materializing param=text_model.encoder.layers.6.self_attn.v_proj.bias] Loading weights: 29%|##8 | 115/398 [00:00<00:00, 2438.66it/s, Materializing param=text_model.encoder.layers.6.self_attn.v_proj.weight] Loading weights: 29%|##8 | 115/398 [00:00<00:00, 2431.01it/s, Materializing param=text_model.encoder.layers.6.self_attn.v_proj.weight] Loading weights: 29%|##9 | 116/398 [00:00<00:00, 2439.91it/s, Materializing param=text_model.encoder.layers.7.layer_norm1.bias] Loading weights: 29%|##9 | 116/398 [00:00<00:00, 2433.24it/s, Materializing param=text_model.encoder.layers.7.layer_norm1.bias] Loading weights: 29%|##9 | 117/398 [00:00<00:00, 2442.44it/s, 
Materializing param=text_model.encoder.layers.7.layer_norm1.weight] Loading weights: 29%|##9 | 117/398 [00:00<00:00, 2426.62it/s, Materializing param=text_model.encoder.layers.7.layer_norm1.weight] Loading weights: 30%|##9 | 118/398 [00:00<00:00, 2433.88it/s, Materializing param=text_model.encoder.layers.7.layer_norm2.bias] Loading weights: 30%|##9 | 118/398 [00:00<00:00, 2425.62it/s, Materializing param=text_model.encoder.layers.7.layer_norm2.bias] Loading weights: 30%|##9 | 119/398 [00:00<00:00, 2437.01it/s, Materializing param=text_model.encoder.layers.7.layer_norm2.weight] Loading weights: 30%|##9 | 119/398 [00:00<00:00, 2430.90it/s, Materializing param=text_model.encoder.layers.7.layer_norm2.weight] Loading weights: 30%|### | 120/398 [00:00<00:00, 2443.21it/s, Materializing param=text_model.encoder.layers.7.mlp.fc1.bias] Loading weights: 30%|### | 120/398 [00:00<00:00, 2434.87it/s, Materializing param=text_model.encoder.layers.7.mlp.fc1.bias] Loading weights: 30%|### | 121/398 [00:00<00:00, 2446.81it/s, Materializing param=text_model.encoder.layers.7.mlp.fc1.weight] Loading weights: 30%|### | 121/398 [00:00<00:00, 2441.11it/s, Materializing param=text_model.encoder.layers.7.mlp.fc1.weight] Loading weights: 31%|### | 122/398 [00:00<00:00, 2453.36it/s, Materializing param=text_model.encoder.layers.7.mlp.fc2.bias] Loading weights: 31%|### | 122/398 [00:00<00:00, 2447.28it/s, Materializing param=text_model.encoder.layers.7.mlp.fc2.bias] Loading weights: 31%|### | 123/398 [00:00<00:00, 2459.29it/s, Materializing param=text_model.encoder.layers.7.mlp.fc2.weight] Loading weights: 31%|### | 123/398 [00:00<00:00, 2450.75it/s, Materializing param=text_model.encoder.layers.7.mlp.fc2.weight] Loading weights: 31%|###1 | 124/398 [00:00<00:00, 2457.11it/s, Materializing param=text_model.encoder.layers.7.self_attn.k_proj.bias] Loading weights: 31%|###1 | 124/398 [00:00<00:00, 2450.54it/s, Materializing param=text_model.encoder.layers.7.self_attn.k_proj.bias] Loading weights: 
31%|###1 | 125/398 [00:00<00:00, 2461.94it/s, Materializing param=text_model.encoder.layers.7.self_attn.k_proj.weight] Loading weights: 31%|###1 | 125/398 [00:00<00:00, 2453.69it/s, Materializing param=text_model.encoder.layers.7.self_attn.k_proj.weight] Loading weights: 32%|###1 | 126/398 [00:00<00:00, 2464.98it/s, Materializing param=text_model.encoder.layers.7.self_attn.out_proj.bias] Loading weights: 32%|###1 | 126/398 [00:00<00:00, 2456.97it/s, Materializing param=text_model.encoder.layers.7.self_attn.out_proj.bias] Loading weights: 32%|###1 | 127/398 [00:00<00:00, 2462.65it/s, Materializing param=text_model.encoder.layers.7.self_attn.out_proj.weight] Loading weights: 32%|###1 | 127/398 [00:00<00:00, 2452.99it/s, Materializing param=text_model.encoder.layers.7.self_attn.out_proj.weight] Loading weights: 32%|###2 | 128/398 [00:00<00:00, 2461.74it/s, Materializing param=text_model.encoder.layers.7.self_attn.q_proj.bias] Loading weights: 32%|###2 | 128/398 [00:00<00:00, 2454.01it/s, Materializing param=text_model.encoder.layers.7.self_attn.q_proj.bias] Loading weights: 32%|###2 | 129/398 [00:00<00:00, 2465.29it/s, Materializing param=text_model.encoder.layers.7.self_attn.q_proj.weight] Loading weights: 32%|###2 | 129/398 [00:00<00:00, 2456.68it/s, Materializing param=text_model.encoder.layers.7.self_attn.q_proj.weight] Loading weights: 33%|###2 | 130/398 [00:00<00:00, 2461.39it/s, Materializing param=text_model.encoder.layers.7.self_attn.v_proj.bias] Loading weights: 33%|###2 | 130/398 [00:00<00:00, 2452.91it/s, Materializing param=text_model.encoder.layers.7.self_attn.v_proj.bias] Loading weights: 33%|###2 | 131/398 [00:00<00:00, 2462.91it/s, Materializing param=text_model.encoder.layers.7.self_attn.v_proj.weight] Loading weights: 33%|###2 | 131/398 [00:00<00:00, 2457.23it/s, Materializing param=text_model.encoder.layers.7.self_attn.v_proj.weight] Loading weights: 33%|###3 | 132/398 [00:00<00:00, 2468.22it/s, Materializing 
param=text_model.encoder.layers.8.layer_norm1.bias] Loading weights: 33%|###3 | 132/398 [00:00<00:00, 2462.98it/s, Materializing param=text_model.encoder.layers.8.layer_norm1.bias] Loading weights: 33%|###3 | 133/398 [00:00<00:00, 2474.39it/s, Materializing param=text_model.encoder.layers.8.layer_norm1.weight] Loading weights: 33%|###3 | 133/398 [00:00<00:00, 2469.31it/s, Materializing param=text_model.encoder.layers.8.layer_norm1.weight] Loading weights: 34%|###3 | 134/398 [00:00<00:00, 2481.16it/s, Materializing param=text_model.encoder.layers.8.layer_norm2.bias] Loading weights: 34%|###3 | 134/398 [00:00<00:00, 2475.82it/s, Materializing param=text_model.encoder.layers.8.layer_norm2.bias] Loading weights: 34%|###3 | 135/398 [00:00<00:00, 2484.30it/s, Materializing param=text_model.encoder.layers.8.layer_norm2.weight] Loading weights: 34%|###3 | 135/398 [00:00<00:00, 2471.09it/s, Materializing param=text_model.encoder.layers.8.layer_norm2.weight] Loading weights: 34%|###4 | 136/398 [00:00<00:00, 2481.08it/s, Materializing param=text_model.encoder.layers.8.mlp.fc1.bias] Loading weights: 34%|###4 | 136/398 [00:00<00:00, 2475.62it/s, Materializing param=text_model.encoder.layers.8.mlp.fc1.bias] Loading weights: 34%|###4 | 137/398 [00:00<00:00, 2485.87it/s, Materializing param=text_model.encoder.layers.8.mlp.fc1.weight] Loading weights: 34%|###4 | 137/398 [00:00<00:00, 2480.75it/s, Materializing param=text_model.encoder.layers.8.mlp.fc1.weight] Loading weights: 35%|###4 | 138/398 [00:00<00:00, 2489.45it/s, Materializing param=text_model.encoder.layers.8.mlp.fc2.bias] Loading weights: 35%|###4 | 138/398 [00:00<00:00, 2481.90it/s, Materializing param=text_model.encoder.layers.8.mlp.fc2.bias] Loading weights: 35%|###4 | 139/398 [00:00<00:00, 2488.39it/s, Materializing param=text_model.encoder.layers.8.mlp.fc2.weight] Loading weights: 35%|###4 | 139/398 [00:00<00:00, 2479.01it/s, Materializing param=text_model.encoder.layers.8.mlp.fc2.weight] Loading weights: 35%|###5 | 
140/398 [00:00<00:00, 2486.38it/s, Materializing param=text_model.encoder.layers.8.self_attn.k_proj.bias] Loading weights: 35%|###5 | 140/398 [00:00<00:00, 2478.87it/s, Materializing param=text_model.encoder.layers.8.self_attn.k_proj.bias] Loading weights: 35%|###5 | 141/398 [00:00<00:00, 2488.24it/s, Materializing param=text_model.encoder.layers.8.self_attn.k_proj.weight] Loading weights: 35%|###5 | 141/398 [00:00<00:00, 2482.15it/s, Materializing param=text_model.encoder.layers.8.self_attn.k_proj.weight] Loading weights: 36%|###5 | 142/398 [00:00<00:00, 2491.04it/s, Materializing param=text_model.encoder.layers.8.self_attn.out_proj.bias] Loading weights: 36%|###5 | 142/398 [00:00<00:00, 2483.97it/s, Materializing param=text_model.encoder.layers.8.self_attn.out_proj.bias] Loading weights: 36%|###5 | 143/398 [00:00<00:00, 2493.20it/s, Materializing param=text_model.encoder.layers.8.self_attn.out_proj.weight] Loading weights: 36%|###5 | 143/398 [00:00<00:00, 2485.76it/s, Materializing param=text_model.encoder.layers.8.self_attn.out_proj.weight] Loading weights: 36%|###6 | 144/398 [00:00<00:00, 2486.57it/s, Materializing param=text_model.encoder.layers.8.self_attn.q_proj.bias] Loading weights: 36%|###6 | 144/398 [00:00<00:00, 2477.83it/s, Materializing param=text_model.encoder.layers.8.self_attn.q_proj.bias] Loading weights: 36%|###6 | 145/398 [00:00<00:00, 2484.16it/s, Materializing param=text_model.encoder.layers.8.self_attn.q_proj.weight] Loading weights: 36%|###6 | 145/398 [00:00<00:00, 2476.32it/s, Materializing param=text_model.encoder.layers.8.self_attn.q_proj.weight] Loading weights: 37%|###6 | 146/398 [00:00<00:00, 2484.89it/s, Materializing param=text_model.encoder.layers.8.self_attn.v_proj.bias] Loading weights: 37%|###6 | 146/398 [00:00<00:00, 2479.17it/s, Materializing param=text_model.encoder.layers.8.self_attn.v_proj.bias] Loading weights: 37%|###6 | 147/398 [00:00<00:00, 2488.57it/s, Materializing 
param=text_model.encoder.layers.8.self_attn.v_proj.weight] Loading weights: 37%|###6 | 147/398 [00:00<00:00, 2483.10it/s, Materializing param=text_model.encoder.layers.8.self_attn.v_proj.weight] Loading weights: 37%|###7 | 148/398 [00:00<00:00, 2492.92it/s, Materializing param=text_model.encoder.layers.9.layer_norm1.bias] Loading weights: 37%|###7 | 148/398 [00:00<00:00, 2487.82it/s, Materializing param=text_model.encoder.layers.9.layer_norm1.bias] Loading weights: 37%|###7 | 149/398 [00:00<00:00, 2497.35it/s, Materializing param=text_model.encoder.layers.9.layer_norm1.weight] Loading weights: 37%|###7 | 149/398 [00:00<00:00, 2491.93it/s, Materializing param=text_model.encoder.layers.9.layer_norm1.weight] Loading weights: 38%|###7 | 150/398 [00:00<00:00, 2499.04it/s, Materializing param=text_model.encoder.layers.9.layer_norm2.bias] Loading weights: 38%|###7 | 150/398 [00:00<00:00, 2493.66it/s, Materializing param=text_model.encoder.layers.9.layer_norm2.bias] Loading weights: 38%|###7 | 151/398 [00:00<00:00, 2499.64it/s, Materializing param=text_model.encoder.layers.9.layer_norm2.weight] Loading weights: 38%|###7 | 151/398 [00:00<00:00, 2489.90it/s, Materializing param=text_model.encoder.layers.9.layer_norm2.weight] Loading weights: 38%|###8 | 152/398 [00:00<00:00, 2492.50it/s, Materializing param=text_model.encoder.layers.9.mlp.fc1.bias] Loading weights: 38%|###8 | 152/398 [00:00<00:00, 2482.47it/s, Materializing param=text_model.encoder.layers.9.mlp.fc1.bias] Loading weights: 38%|###8 | 153/398 [00:00<00:00, 2485.98it/s, Materializing param=text_model.encoder.layers.9.mlp.fc1.weight] Loading weights: 38%|###8 | 153/398 [00:00<00:00, 2479.15it/s, Materializing param=text_model.encoder.layers.9.mlp.fc1.weight] Loading weights: 39%|###8 | 154/398 [00:00<00:00, 2485.26it/s, Materializing param=text_model.encoder.layers.9.mlp.fc2.bias] Loading weights: 39%|###8 | 154/398 [00:00<00:00, 2478.93it/s, Materializing param=text_model.encoder.layers.9.mlp.fc2.bias] Loading 
weights: 39%|###8 | 155/398 [00:00<00:00, 2487.53it/s, Materializing param=text_model.encoder.layers.9.mlp.fc2.weight] Loading weights: 39%|###8 | 155/398 [00:00<00:00, 2482.22it/s, Materializing param=text_model.encoder.layers.9.mlp.fc2.weight] Loading weights: 39%|###9 | 156/398 [00:00<00:00, 2488.00it/s, Materializing param=text_model.encoder.layers.9.self_attn.k_proj.bias] Loading weights: 39%|###9 | 156/398 [00:00<00:00, 2479.24it/s, Materializing param=text_model.encoder.layers.9.self_attn.k_proj.bias] Loading weights: 39%|###9 | 157/398 [00:00<00:00, 2485.35it/s, Materializing param=text_model.encoder.layers.9.self_attn.k_proj.weight] Loading weights: 39%|###9 | 157/398 [00:00<00:00, 2479.10it/s, Materializing param=text_model.encoder.layers.9.self_attn.k_proj.weight] Loading weights: 40%|###9 | 158/398 [00:00<00:00, 2487.12it/s, Materializing param=text_model.encoder.layers.9.self_attn.out_proj.bias] Loading weights: 40%|###9 | 158/398 [00:00<00:00, 2482.35it/s, Materializing param=text_model.encoder.layers.9.self_attn.out_proj.bias] Loading weights: 40%|###9 | 159/398 [00:00<00:00, 2491.65it/s, Materializing param=text_model.encoder.layers.9.self_attn.out_proj.weight] Loading weights: 40%|###9 | 159/398 [00:00<00:00, 2486.80it/s, Materializing param=text_model.encoder.layers.9.self_attn.out_proj.weight] Loading weights: 40%|#### | 160/398 [00:00<00:00, 2495.45it/s, Materializing param=text_model.encoder.layers.9.self_attn.q_proj.bias] Loading weights: 40%|#### | 160/398 [00:00<00:00, 2490.62it/s, Materializing param=text_model.encoder.layers.9.self_attn.q_proj.bias] Loading weights: 40%|#### | 161/398 [00:00<00:00, 2499.84it/s, Materializing param=text_model.encoder.layers.9.self_attn.q_proj.weight] Loading weights: 40%|#### | 161/398 [00:00<00:00, 2493.75it/s, Materializing param=text_model.encoder.layers.9.self_attn.q_proj.weight] Loading weights: 41%|#### | 162/398 [00:00<00:00, 2499.32it/s, Materializing 
param=text_model.encoder.layers.9.self_attn.v_proj.bias] Loading weights: 41%|#### | 162/398 [00:00<00:00, 2493.39it/s, Materializing param=text_model.encoder.layers.9.self_attn.v_proj.bias] Loading weights: 41%|#### | 163/398 [00:00<00:00, 2501.40it/s, Materializing param=text_model.encoder.layers.9.self_attn.v_proj.weight] Loading weights: 41%|#### | 163/398 [00:00<00:00, 2496.58it/s, Materializing param=text_model.encoder.layers.9.self_attn.v_proj.weight] Loading weights: 41%|####1 | 164/398 [00:00<00:00, 2505.85it/s, Materializing param=text_model.encoder.layers.10.layer_norm1.bias] Loading weights: 41%|####1 | 164/398 [00:00<00:00, 2500.34it/s, Materializing param=text_model.encoder.layers.10.layer_norm1.bias] Loading weights: 41%|####1 | 165/398 [00:00<00:00, 2501.18it/s, Materializing param=text_model.encoder.layers.10.layer_norm1.weight] Loading weights: 41%|####1 | 165/398 [00:00<00:00, 2493.25it/s, Materializing param=text_model.encoder.layers.10.layer_norm1.weight] Loading weights: 42%|####1 | 166/398 [00:00<00:00, 2497.50it/s, Materializing param=text_model.encoder.layers.10.layer_norm2.bias] Loading weights: 42%|####1 | 166/398 [00:00<00:00, 2490.84it/s, Materializing param=text_model.encoder.layers.10.layer_norm2.bias] Loading weights: 42%|####1 | 167/398 [00:00<00:00, 2498.20it/s, Materializing param=text_model.encoder.layers.10.layer_norm2.weight] Loading weights: 42%|####1 | 167/398 [00:00<00:00, 2493.44it/s, Materializing param=text_model.encoder.layers.10.layer_norm2.weight] Loading weights: 42%|####2 | 168/398 [00:00<00:00, 2502.38it/s, Materializing param=text_model.encoder.layers.10.mlp.fc1.bias] Loading weights: 42%|####2 | 168/398 [00:00<00:00, 2497.94it/s, Materializing param=text_model.encoder.layers.10.mlp.fc1.bias] Loading weights: 42%|####2 | 169/398 [00:00<00:00, 2502.89it/s, Materializing param=text_model.encoder.layers.10.mlp.fc1.weight] Loading weights: 42%|####2 | 169/398 [00:00<00:00, 2493.71it/s, Materializing 
param=text_model.encoder.layers.10.mlp.fc1.weight] Loading weights: 43%|####2 | 170/398 [00:00<00:00, 2500.16it/s, Materializing param=text_model.encoder.layers.10.mlp.fc2.bias] Loading weights: 43%|####2 | 170/398 [00:00<00:00, 2495.46it/s, Materializing param=text_model.encoder.layers.10.mlp.fc2.bias] Loading weights: 43%|####2 | 171/398 [00:00<00:00, 2504.23it/s, Materializing param=text_model.encoder.layers.10.mlp.fc2.weight] Loading weights: 43%|####2 | 171/398 [00:00<00:00, 2500.13it/s, Materializing param=text_model.encoder.layers.10.mlp.fc2.weight] Loading weights: 43%|####3 | 172/398 [00:00<00:00, 2509.24it/s, Materializing param=text_model.encoder.layers.10.self_attn.k_proj.bias] Loading weights: 43%|####3 | 172/398 [00:00<00:00, 2505.12it/s, Materializing param=text_model.encoder.layers.10.self_attn.k_proj.bias] Loading weights: 43%|####3 | 173/398 [00:00<00:00, 2513.94it/s, Materializing param=text_model.encoder.layers.10.self_attn.k_proj.weight] Loading weights: 43%|####3 | 173/398 [00:00<00:00, 2509.82it/s, Materializing param=text_model.encoder.layers.10.self_attn.k_proj.weight] Loading weights: 44%|####3 | 174/398 [00:00<00:00, 2518.83it/s, Materializing param=text_model.encoder.layers.10.self_attn.out_proj.bias] Loading weights: 44%|####3 | 174/398 [00:00<00:00, 2514.95it/s, Materializing param=text_model.encoder.layers.10.self_attn.out_proj.bias] Loading weights: 44%|####3 | 175/398 [00:00<00:00, 2523.84it/s, Materializing param=text_model.encoder.layers.10.self_attn.out_proj.weight] Loading weights: 44%|####3 | 175/398 [00:00<00:00, 2517.50it/s, Materializing param=text_model.encoder.layers.10.self_attn.out_proj.weight] Loading weights: 44%|####4 | 176/398 [00:00<00:00, 2522.73it/s, Materializing param=text_model.encoder.layers.10.self_attn.q_proj.bias] Loading weights: 44%|####4 | 176/398 [00:00<00:00, 2516.29it/s, Materializing param=text_model.encoder.layers.10.self_attn.q_proj.bias] Loading weights: 44%|####4 | 177/398 [00:00<00:00, 
2523.68it/s, Materializing param=text_model.encoder.layers.10.self_attn.q_proj.weight] Loading weights: 44%|####4 | 177/398 [00:00<00:00, 2518.90it/s, Materializing param=text_model.encoder.layers.10.self_attn.q_proj.weight] Loading weights: 45%|####4 | 178/398 [00:00<00:00, 2527.28it/s, Materializing param=text_model.encoder.layers.10.self_attn.v_proj.bias] Loading weights: 45%|####4 | 178/398 [00:00<00:00, 2523.19it/s, Materializing param=text_model.encoder.layers.10.self_attn.v_proj.bias] Loading weights: 45%|####4 | 179/398 [00:00<00:00, 2531.86it/s, Materializing param=text_model.encoder.layers.10.self_attn.v_proj.weight] Loading weights: 45%|####4 | 179/398 [00:00<00:00, 2525.21it/s, Materializing param=text_model.encoder.layers.10.self_attn.v_proj.weight] Loading weights: 45%|####5 | 180/398 [00:00<00:00, 2529.03it/s, Materializing param=text_model.encoder.layers.11.layer_norm1.bias] Loading weights: 45%|####5 | 180/398 [00:00<00:00, 2523.03it/s, Materializing param=text_model.encoder.layers.11.layer_norm1.bias] Loading weights: 45%|####5 | 181/398 [00:00<00:00, 2529.83it/s, Materializing param=text_model.encoder.layers.11.layer_norm1.weight] Loading weights: 45%|####5 | 181/398 [00:00<00:00, 2524.15it/s, Materializing param=text_model.encoder.layers.11.layer_norm1.weight] Loading weights: 46%|####5 | 182/398 [00:00<00:00, 2531.56it/s, Materializing param=text_model.encoder.layers.11.layer_norm2.bias] Loading weights: 46%|####5 | 182/398 [00:00<00:00, 2526.54it/s, Materializing param=text_model.encoder.layers.11.layer_norm2.bias] Loading weights: 46%|####5 | 183/398 [00:00<00:00, 2530.30it/s, Materializing param=text_model.encoder.layers.11.layer_norm2.weight] Loading weights: 46%|####5 | 183/398 [00:00<00:00, 2524.74it/s, Materializing param=text_model.encoder.layers.11.layer_norm2.weight] Loading weights: 46%|####6 | 184/398 [00:00<00:00, 2530.33it/s, Materializing param=text_model.encoder.layers.11.mlp.fc1.bias] Loading weights: 46%|####6 | 184/398 
[00:00<00:00, 2524.23it/s, Materializing param=text_model.encoder.layers.11.mlp.fc1.bias] Loading weights: 46%|####6 | 185/398 [00:00<00:00, 2531.21it/s, Materializing param=text_model.encoder.layers.11.mlp.fc1.weight] Loading weights: 46%|####6 | 185/398 [00:00<00:00, 2526.93it/s, Materializing param=text_model.encoder.layers.11.mlp.fc1.weight] Loading weights: 47%|####6 | 186/398 [00:00<00:00, 2534.94it/s, Materializing param=text_model.encoder.layers.11.mlp.fc2.bias] Loading weights: 47%|####6 | 186/398 [00:00<00:00, 2530.75it/s, Materializing param=text_model.encoder.layers.11.mlp.fc2.bias] Loading weights: 47%|####6 | 187/398 [00:00<00:00, 2538.78it/s, Materializing param=text_model.encoder.layers.11.mlp.fc2.weight] Loading weights: 47%|####6 | 187/398 [00:00<00:00, 2534.64it/s, Materializing param=text_model.encoder.layers.11.mlp.fc2.weight] Loading weights: 47%|####7 | 188/398 [00:00<00:00, 2542.55it/s, Materializing param=text_model.encoder.layers.11.self_attn.k_proj.bias] Loading weights: 47%|####7 | 188/398 [00:00<00:00, 2538.03it/s, Materializing param=text_model.encoder.layers.11.self_attn.k_proj.bias] Loading weights: 47%|####7 | 189/398 [00:00<00:00, 2543.20it/s, Materializing param=text_model.encoder.layers.11.self_attn.k_proj.weight] Loading weights: 47%|####7 | 189/398 [00:00<00:00, 2537.16it/s, Materializing param=text_model.encoder.layers.11.self_attn.k_proj.weight] Loading weights: 48%|####7 | 190/398 [00:00<00:00, 2544.44it/s, Materializing param=text_model.encoder.layers.11.self_attn.out_proj.bias] Loading weights: 48%|####7 | 190/398 [00:00<00:00, 2540.11it/s, Materializing param=text_model.encoder.layers.11.self_attn.out_proj.bias] Loading weights: 48%|####7 | 191/398 [00:00<00:00, 2546.19it/s, Materializing param=text_model.encoder.layers.11.self_attn.out_proj.weight] Loading weights: 48%|####7 | 191/398 [00:00<00:00, 2540.14it/s, Materializing param=text_model.encoder.layers.11.self_attn.out_proj.weight] Loading weights: 48%|####8 | 
192/398 [00:00<00:00, 2546.75it/s, Materializing param=text_model.encoder.layers.11.self_attn.q_proj.bias] Loading weights: 48%|####8 | 192/398 [00:00<00:00, 2542.51it/s, Materializing param=text_model.encoder.layers.11.self_attn.q_proj.bias] Loading weights: 48%|####8 | 193/398 [00:00<00:00, 2546.00it/s, Materializing param=text_model.encoder.layers.11.self_attn.q_proj.weight] Loading weights: 48%|####8 | 193/398 [00:00<00:00, 2540.44it/s, Materializing param=text_model.encoder.layers.11.self_attn.q_proj.weight] Loading weights: 49%|####8 | 194/398 [00:00<00:00, 2546.74it/s, Materializing param=text_model.encoder.layers.11.self_attn.v_proj.bias] Loading weights: 49%|####8 | 194/398 [00:00<00:00, 2542.64it/s, Materializing param=text_model.encoder.layers.11.self_attn.v_proj.bias] Loading weights: 49%|####8 | 195/398 [00:00<00:00, 2550.21it/s, Materializing param=text_model.encoder.layers.11.self_attn.v_proj.weight] Loading weights: 49%|####8 | 195/398 [00:00<00:00, 2546.02it/s, Materializing param=text_model.encoder.layers.11.self_attn.v_proj.weight] Loading weights: 49%|####9 | 196/398 [00:00<00:00, 2553.54it/s, Materializing param=text_model.final_layer_norm.bias] Loading weights: 49%|####9 | 196/398 [00:00<00:00, 2547.52it/s, Materializing param=text_model.final_layer_norm.bias] Loading weights: 49%|####9 | 197/398 [00:00<00:00, 2542.71it/s, Materializing param=text_model.final_layer_norm.weight] Loading weights: 49%|####9 | 197/398 [00:00<00:00, 2534.33it/s, Materializing param=text_model.final_layer_norm.weight] Loading weights: 50%|####9 | 198/398 [00:00<00:00, 2533.33it/s, Materializing param=text_projection.weight] Loading weights: 50%|####9 | 198/398 [00:00<00:00, 2525.07it/s, Materializing param=text_projection.weight] Loading weights: 50%|##### | 199/398 [00:00<00:00, 2526.25it/s, Materializing param=vision_model.embeddings.class_embedding] Loading weights: 50%|##### | 199/398 [00:00<00:00, 2520.28it/s, Materializing 
param=vision_model.embeddings.class_embedding] Loading weights: 50%|##### | 200/398 [00:00<00:00, 2526.31it/s, Materializing param=vision_model.embeddings.patch_embedding.weight] Loading weights: 50%|##### | 200/398 [00:00<00:00, 2522.02it/s, Materializing param=vision_model.embeddings.patch_embedding.weight] Loading weights: 51%|##### | 201/398 [00:00<00:00, 2528.86it/s, Materializing param=vision_model.embeddings.position_embedding.weight] Loading weights: 51%|##### | 201/398 [00:00<00:00, 2525.07it/s, Materializing param=vision_model.embeddings.position_embedding.weight] Loading weights: 51%|##### | 202/398 [00:00<00:00, 2532.24it/s, Materializing param=vision_model.encoder.layers.0.layer_norm1.bias] Loading weights: 51%|##### | 202/398 [00:00<00:00, 2526.95it/s, Materializing param=vision_model.encoder.layers.0.layer_norm1.bias] Loading weights: 51%|#####1 | 203/398 [00:00<00:00, 2531.72it/s, Materializing param=vision_model.encoder.layers.0.layer_norm1.weight] Loading weights: 51%|#####1 | 203/398 [00:00<00:00, 2526.45it/s, Materializing param=vision_model.encoder.layers.0.layer_norm1.weight] Loading weights: 51%|#####1 | 204/398 [00:00<00:00, 2532.67it/s, Materializing param=vision_model.encoder.layers.0.layer_norm2.bias] Loading weights: 51%|#####1 | 204/398 [00:00<00:00, 2528.50it/s, Materializing param=vision_model.encoder.layers.0.layer_norm2.bias] Loading weights: 52%|#####1 | 205/398 [00:00<00:00, 2535.66it/s, Materializing param=vision_model.encoder.layers.0.layer_norm2.weight] Loading weights: 52%|#####1 | 205/398 [00:00<00:00, 2529.44it/s, Materializing param=vision_model.encoder.layers.0.layer_norm2.weight] Loading weights: 52%|#####1 | 206/398 [00:00<00:00, 2533.66it/s, Materializing param=vision_model.encoder.layers.0.mlp.fc1.bias] Loading weights: 52%|#####1 | 206/398 [00:00<00:00, 2527.81it/s, Materializing param=vision_model.encoder.layers.0.mlp.fc1.bias] Loading weights: 52%|#####2 | 207/398 [00:00<00:00, 2533.01it/s, Materializing 
param=vision_model.encoder.layers.0.mlp.fc1.weight] Loading weights: 52%|#####2 | 207/398 [00:00<00:00, 2528.01it/s, Materializing param=vision_model.encoder.layers.0.mlp.fc1.weight] Loading weights: 52%|#####2 | 208/398 [00:00<00:00, 2534.28it/s, Materializing param=vision_model.encoder.layers.0.mlp.fc2.bias] Loading weights: 52%|#####2 | 208/398 [00:00<00:00, 2530.52it/s, Materializing param=vision_model.encoder.layers.0.mlp.fc2.bias] Loading weights: 53%|#####2 | 209/398 [00:00<00:00, 2537.33it/s, Materializing param=vision_model.encoder.layers.0.mlp.fc2.weight] Loading weights: 53%|#####2 | 209/398 [00:00<00:00, 2532.44it/s, Materializing param=vision_model.encoder.layers.0.mlp.fc2.weight] Loading weights: 53%|#####2 | 210/398 [00:00<00:00, 2534.72it/s, Materializing param=vision_model.encoder.layers.0.self_attn.k_proj.bias] Loading weights: 53%|#####2 | 210/398 [00:00<00:00, 2528.44it/s, Materializing param=vision_model.encoder.layers.0.self_attn.k_proj.bias] Loading weights: 53%|#####3 | 211/398 [00:00<00:00, 2534.74it/s, Materializing param=vision_model.encoder.layers.0.self_attn.k_proj.weight] Loading weights: 53%|#####3 | 211/398 [00:00<00:00, 2530.99it/s, Materializing param=vision_model.encoder.layers.0.self_attn.k_proj.weight] Loading weights: 53%|#####3 | 212/398 [00:00<00:00, 2536.33it/s, Materializing param=vision_model.encoder.layers.0.self_attn.out_proj.bias] Loading weights: 53%|#####3 | 212/398 [00:00<00:00, 2531.54it/s, Materializing param=vision_model.encoder.layers.0.self_attn.out_proj.bias] Loading weights: 54%|#####3 | 213/398 [00:00<00:00, 2538.24it/s, Materializing param=vision_model.encoder.layers.0.self_attn.out_proj.weight] Loading weights: 54%|#####3 | 213/398 [00:00<00:00, 2534.55it/s, Materializing param=vision_model.encoder.layers.0.self_attn.out_proj.weight] Loading weights: 54%|#####3 | 214/398 [00:00<00:00, 2541.27it/s, Materializing param=vision_model.encoder.layers.0.self_attn.q_proj.bias] Loading weights: 54%|#####3 | 214/398 
[00:00<00:00, 2537.72it/s, Materializing param=vision_model.encoder.layers.0.self_attn.q_proj.bias] Loading weights: 54%|#####4 | 215/398 [00:00<00:00, 2543.44it/s, Materializing param=vision_model.encoder.layers.0.self_attn.q_proj.weight] Loading weights: 54%|#####4 | 215/398 [00:00<00:00, 2538.64it/s, Materializing param=vision_model.encoder.layers.0.self_attn.q_proj.weight] Loading weights: 54%|#####4 | 216/398 [00:00<00:00, 2544.48it/s, Materializing param=vision_model.encoder.layers.0.self_attn.v_proj.bias] Loading weights: 54%|#####4 | 216/398 [00:00<00:00, 2540.67it/s, Materializing param=vision_model.encoder.layers.0.self_attn.v_proj.bias] Loading weights: 55%|#####4 | 217/398 [00:00<00:00, 2547.35it/s, Materializing param=vision_model.encoder.layers.0.self_attn.v_proj.weight] Loading weights: 55%|#####4 | 217/398 [00:00<00:00, 2541.78it/s, Materializing param=vision_model.encoder.layers.0.self_attn.v_proj.weight] Loading weights: 55%|#####4 | 218/398 [00:00<00:00, 2545.58it/s, Materializing param=vision_model.encoder.layers.1.layer_norm1.bias] Loading weights: 55%|#####4 | 218/398 [00:00<00:00, 2539.21it/s, Materializing param=vision_model.encoder.layers.1.layer_norm1.bias] Loading weights: 55%|#####5 | 219/398 [00:00<00:00, 2543.40it/s, Materializing param=vision_model.encoder.layers.1.layer_norm1.weight] Loading weights: 55%|#####5 | 219/398 [00:00<00:00, 2538.21it/s, Materializing param=vision_model.encoder.layers.1.layer_norm1.weight] Loading weights: 55%|#####5 | 220/398 [00:00<00:00, 2543.99it/s, Materializing param=vision_model.encoder.layers.1.layer_norm2.bias] Loading weights: 55%|#####5 | 220/398 [00:00<00:00, 2539.36it/s, Materializing param=vision_model.encoder.layers.1.layer_norm2.bias] Loading weights: 56%|#####5 | 221/398 [00:00<00:00, 2544.58it/s, Materializing param=vision_model.encoder.layers.1.layer_norm2.weight] Loading weights: 56%|#####5 | 221/398 [00:00<00:00, 2539.50it/s, Materializing 
param=vision_model.encoder.layers.1.layer_norm2.weight] Loading weights: 56%|#####5 | 222/398 [00:00<00:00, 2545.14it/s, Materializing param=vision_model.encoder.layers.1.mlp.fc1.bias] Loading weights: 56%|#####5 | 222/398 [00:00<00:00, 2541.30it/s, Materializing param=vision_model.encoder.layers.1.mlp.fc1.bias] Loading weights: 56%|#####6 | 223/398 [00:00<00:00, 2547.70it/s, Materializing param=vision_model.encoder.layers.1.mlp.fc1.weight] Loading weights: 56%|#####6 | 223/398 [00:00<00:00, 2544.19it/s, Materializing param=vision_model.encoder.layers.1.mlp.fc1.weight] Loading weights: 56%|#####6 | 224/398 [00:00<00:00, 2550.58it/s, Materializing param=vision_model.encoder.layers.1.mlp.fc2.bias] Loading weights: 56%|#####6 | 224/398 [00:00<00:00, 2546.78it/s, Materializing param=vision_model.encoder.layers.1.mlp.fc2.bias] Loading weights: 57%|#####6 | 225/398 [00:00<00:00, 2553.18it/s, Materializing param=vision_model.encoder.layers.1.mlp.fc2.weight] Loading weights: 57%|#####6 | 225/398 [00:00<00:00, 2549.60it/s, Materializing param=vision_model.encoder.layers.1.mlp.fc2.weight] Loading weights: 57%|#####6 | 226/398 [00:00<00:00, 2556.35it/s, Materializing param=vision_model.encoder.layers.1.self_attn.k_proj.bias] Loading weights: 57%|#####6 | 226/398 [00:00<00:00, 2552.97it/s, Materializing param=vision_model.encoder.layers.1.self_attn.k_proj.bias] Loading weights: 57%|#####7 | 227/398 [00:00<00:00, 2559.09it/s, Materializing param=vision_model.encoder.layers.1.self_attn.k_proj.weight] Loading weights: 57%|#####7 | 227/398 [00:00<00:00, 2553.20it/s, Materializing param=vision_model.encoder.layers.1.self_attn.k_proj.weight] Loading weights: 57%|#####7 | 228/398 [00:00<00:00, 2557.67it/s, Materializing param=vision_model.encoder.layers.1.self_attn.out_proj.bias] Loading weights: 57%|#####7 | 228/398 [00:00<00:00, 2552.82it/s, Materializing param=vision_model.encoder.layers.1.self_attn.out_proj.bias] Loading weights: 58%|#####7 | 229/398 [00:00<00:00, 2557.48it/s, 
Materializing param=vision_model.encoder.layers.1.self_attn.out_proj.weight] Loading weights: 58%|#####7 | 229/398 [00:00<00:00, 2550.86it/s, Materializing param=vision_model.encoder.layers.1.self_attn.out_proj.weight] Loading weights: 58%|#####7 | 230/398 [00:00<00:00, 2556.03it/s, Materializing param=vision_model.encoder.layers.1.self_attn.q_proj.bias] Loading weights: 58%|#####7 | 230/398 [00:00<00:00, 2551.15it/s, Materializing param=vision_model.encoder.layers.1.self_attn.q_proj.bias] Loading weights: 58%|#####8 | 231/398 [00:00<00:00, 2555.36it/s, Materializing param=vision_model.encoder.layers.1.self_attn.q_proj.weight] Loading weights: 58%|#####8 | 231/398 [00:00<00:00, 2550.60it/s, Materializing param=vision_model.encoder.layers.1.self_attn.q_proj.weight] Loading weights: 58%|#####8 | 232/398 [00:00<00:00, 2556.19it/s, Materializing param=vision_model.encoder.layers.1.self_attn.v_proj.bias] Loading weights: 58%|#####8 | 232/398 [00:00<00:00, 2552.60it/s, Materializing param=vision_model.encoder.layers.1.self_attn.v_proj.bias] Loading weights: 59%|#####8 | 233/398 [00:00<00:00, 2558.13it/s, Materializing param=vision_model.encoder.layers.1.self_attn.v_proj.weight] Loading weights: 59%|#####8 | 233/398 [00:00<00:00, 2553.23it/s, Materializing param=vision_model.encoder.layers.1.self_attn.v_proj.weight] Loading weights: 59%|#####8 | 234/398 [00:00<00:00, 2557.45it/s, Materializing param=vision_model.encoder.layers.2.layer_norm1.bias] Loading weights: 59%|#####8 | 234/398 [00:00<00:00, 2553.17it/s, Materializing param=vision_model.encoder.layers.2.layer_norm1.bias] Loading weights: 59%|#####9 | 235/398 [00:00<00:00, 2559.07it/s, Materializing param=vision_model.encoder.layers.2.layer_norm1.weight] Loading weights: 59%|#####9 | 235/398 [00:00<00:00, 2555.78it/s, Materializing param=vision_model.encoder.layers.2.layer_norm1.weight] Loading weights: 59%|#####9 | 236/398 [00:00<00:00, 2562.26it/s, Materializing param=vision_model.encoder.layers.2.layer_norm2.bias] 
Loading weights: 59%|#####9 | 236/398 [00:00<00:00, 2559.20it/s, Materializing param=vision_model.encoder.layers.2.layer_norm2.bias] Loading weights: 60%|#####9 | 237/398 [00:00<00:00, 2565.83it/s, Materializing param=vision_model.encoder.layers.2.layer_norm2.weight] Loading weights: 60%|#####9 | 237/398 [00:00<00:00, 2562.55it/s, Materializing param=vision_model.encoder.layers.2.layer_norm2.weight] Loading weights: 60%|#####9 | 238/398 [00:00<00:00, 2567.19it/s, Materializing param=vision_model.encoder.layers.2.mlp.fc1.bias] Loading weights: 60%|#####9 | 238/398 [00:00<00:00, 2561.43it/s, Materializing param=vision_model.encoder.layers.2.mlp.fc1.bias] Loading weights: 60%|###### | 239/398 [00:00<00:00, 2566.60it/s, Materializing param=vision_model.encoder.layers.2.mlp.fc1.weight] Loading weights: 60%|###### | 239/398 [00:00<00:00, 2562.61it/s, Materializing param=vision_model.encoder.layers.2.mlp.fc1.weight] Loading weights: 60%|###### | 240/398 [00:00<00:00, 2565.97it/s, Materializing param=vision_model.encoder.layers.2.mlp.fc2.bias] Loading weights: 60%|###### | 240/398 [00:00<00:00, 2560.92it/s, Materializing param=vision_model.encoder.layers.2.mlp.fc2.bias] Loading weights: 61%|###### | 241/398 [00:00<00:00, 2559.94it/s, Materializing param=vision_model.encoder.layers.2.mlp.fc2.weight] Loading weights: 61%|###### | 241/398 [00:00<00:00, 2550.41it/s, Materializing param=vision_model.encoder.layers.2.mlp.fc2.weight] Loading weights: 61%|###### | 242/398 [00:00<00:00, 2548.82it/s, Materializing param=vision_model.encoder.layers.2.self_attn.k_proj.bias] Loading weights: 61%|###### | 242/398 [00:00<00:00, 2542.72it/s, Materializing param=vision_model.encoder.layers.2.self_attn.k_proj.bias] Loading weights: 61%|######1 | 243/398 [00:00<00:00, 2545.43it/s, Materializing param=vision_model.encoder.layers.2.self_attn.k_proj.weight] Loading weights: 61%|######1 | 243/398 [00:00<00:00, 2539.94it/s, Materializing 
param=vision_model.encoder.layers.2.self_attn.k_proj.weight] Loading weights: 61%|######1 | 244/398 [00:00<00:00, 2544.16it/s, Materializing param=vision_model.encoder.layers.2.self_attn.out_proj.bias] Loading weights: 61%|######1 | 244/398 [00:00<00:00, 2539.07it/s, Materializing param=vision_model.encoder.layers.2.self_attn.out_proj.bias] Loading weights: 62%|######1 | 245/398 [00:00<00:00, 2542.64it/s, Materializing param=vision_model.encoder.layers.2.self_attn.out_proj.weight] Loading weights: 62%|######1 | 245/398 [00:00<00:00, 2535.59it/s, Materializing param=vision_model.encoder.layers.2.self_attn.out_proj.weight] Loading weights: 62%|######1 | 246/398 [00:00<00:00, 2539.39it/s, Materializing param=vision_model.encoder.layers.2.self_attn.q_proj.bias] Loading weights: 62%|######1 | 246/398 [00:00<00:00, 2534.24it/s, Materializing param=vision_model.encoder.layers.2.self_attn.q_proj.bias] Loading weights: 62%|######2 | 247/398 [00:00<00:00, 2538.63it/s, Materializing param=vision_model.encoder.layers.2.self_attn.q_proj.weight] Loading weights: 62%|######2 | 247/398 [00:00<00:00, 2534.86it/s, Materializing param=vision_model.encoder.layers.2.self_attn.q_proj.weight] Loading weights: 62%|######2 | 248/398 [00:00<00:00, 2539.61it/s, Materializing param=vision_model.encoder.layers.2.self_attn.v_proj.bias] Loading weights: 62%|######2 | 248/398 [00:00<00:00, 2536.13it/s, Materializing param=vision_model.encoder.layers.2.self_attn.v_proj.bias] Loading weights: 63%|######2 | 249/398 [00:00<00:00, 2541.64it/s, Materializing param=vision_model.encoder.layers.2.self_attn.v_proj.weight] Loading weights: 63%|######2 | 249/398 [00:00<00:00, 2537.33it/s, Materializing param=vision_model.encoder.layers.2.self_attn.v_proj.weight] Loading weights: 63%|######2 | 250/398 [00:00<00:00, 2542.04it/s, Materializing param=vision_model.encoder.layers.3.layer_norm1.bias] Loading weights: 63%|######2 | 250/398 [00:00<00:00, 2538.46it/s, Materializing 
param=vision_model.encoder.layers.3.layer_norm1.bias] Loading weights: 63%|######3 | 251/398 [00:00<00:00, 2543.81it/s, Materializing param=vision_model.encoder.layers.3.layer_norm1.weight] Loading weights: 63%|######3 | 251/398 [00:00<00:00, 2540.24it/s, Materializing param=vision_model.encoder.layers.3.layer_norm1.weight] Loading weights: 63%|######3 | 252/398 [00:00<00:00, 2545.57it/s, Materializing param=vision_model.encoder.layers.3.layer_norm2.bias] Loading weights: 63%|######3 | 252/398 [00:00<00:00, 2542.43it/s, Materializing param=vision_model.encoder.layers.3.layer_norm2.bias] Loading weights: 64%|######3 | 253/398 [00:00<00:00, 2548.15it/s, Materializing param=vision_model.encoder.layers.3.layer_norm2.weight] Loading weights: 64%|######3 | 253/398 [00:00<00:00, 2545.08it/s, Materializing param=vision_model.encoder.layers.3.layer_norm2.weight] Loading weights: 64%|######3 | 254/398 [00:00<00:00, 2550.82it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc1.bias] Loading weights: 64%|######3 | 254/398 [00:00<00:00, 2547.73it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc1.bias] Loading weights: 64%|######4 | 255/398 [00:00<00:00, 2551.72it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc1.weight] Loading weights: 64%|######4 | 255/398 [00:00<00:00, 2546.88it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc1.weight] Loading weights: 64%|######4 | 256/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc1.weight] Loading weights: 64%|######4 | 256/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc2.bias] Loading weights: 64%|######4 | 256/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc2.bias] Loading weights: 65%|######4 | 257/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc2.weight] Loading weights: 65%|######4 | 257/398 [00:00<00:00, 2550.96it/s, Materializing 
param=vision_model.encoder.layers.3.mlp.fc2.weight] Loading weights: 65%|######4 | 258/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.k_proj.bias] Loading weights: 65%|######4 | 258/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.k_proj.bias] Loading weights: 65%|######5 | 259/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.k_proj.weight] Loading weights: 65%|######5 | 259/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.k_proj.weight] Loading weights: 65%|######5 | 260/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.out_proj.bias] Loading weights: 65%|######5 | 260/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.out_proj.bias] Loading weights: 66%|######5 | 261/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.out_proj.weight] Loading weights: 66%|######5 | 261/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.out_proj.weight] Loading weights: 66%|######5 | 262/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.q_proj.bias] Loading weights: 66%|######5 | 262/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.q_proj.bias] Loading weights: 66%|######6 | 263/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.q_proj.weight] Loading weights: 66%|######6 | 263/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.q_proj.weight] Loading weights: 66%|######6 | 264/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.v_proj.bias] Loading weights: 66%|######6 | 264/398 [00:00<00:00, 2550.96it/s, Materializing 
param=vision_model.encoder.layers.3.self_attn.v_proj.bias] Loading weights: 67%|######6 | 265/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.v_proj.weight] Loading weights: 67%|######6 | 265/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.v_proj.weight] Loading weights: 67%|######6 | 266/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.layer_norm1.bias] Loading weights: 67%|######6 | 266/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.layer_norm1.bias] Loading weights: 67%|######7 | 267/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.layer_norm1.weight] Loading weights: 67%|######7 | 267/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.layer_norm1.weight] Loading weights: 67%|######7 | 268/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.layer_norm2.bias] Loading weights: 67%|######7 | 268/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.layer_norm2.bias] Loading weights: 68%|######7 | 269/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.layer_norm2.weight] Loading weights: 68%|######7 | 269/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.layer_norm2.weight] Loading weights: 68%|######7 | 270/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.mlp.fc1.bias] Loading weights: 68%|######7 | 270/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.mlp.fc1.bias] Loading weights: 68%|######8 | 271/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.mlp.fc1.weight] Loading weights: 68%|######8 | 271/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.mlp.fc1.weight] Loading weights: 68%|######8 | 272/398 [00:00<00:00, 
2550.96it/s, Materializing param=vision_model.encoder.layers.4.mlp.fc2.bias] Loading weights: 68%|######8 | 272/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.mlp.fc2.bias] Loading weights: 69%|######8 | 273/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.mlp.fc2.weight] Loading weights: 69%|######8 | 273/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.mlp.fc2.weight] Loading weights: 69%|######8 | 274/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.k_proj.bias] Loading weights: 69%|######8 | 274/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.k_proj.bias] Loading weights: 69%|######9 | 275/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.k_proj.weight] Loading weights: 69%|######9 | 275/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.k_proj.weight] Loading weights: 69%|######9 | 276/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.out_proj.bias] Loading weights: 69%|######9 | 276/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.out_proj.bias] Loading weights: 70%|######9 | 277/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.out_proj.weight] Loading weights: 70%|######9 | 277/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.out_proj.weight] Loading weights: 70%|######9 | 278/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.q_proj.bias] Loading weights: 70%|######9 | 278/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.q_proj.bias] Loading weights: 70%|####### | 279/398 [00:00<00:00, 2550.96it/s, Materializing 
param=vision_model.encoder.layers.4.self_attn.q_proj.weight] Loading weights: 70%|####### | 279/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.q_proj.weight] Loading weights: 70%|####### | 280/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.v_proj.bias] Loading weights: 70%|####### | 280/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.v_proj.bias] Loading weights: 71%|####### | 281/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.v_proj.weight] Loading weights: 71%|####### | 281/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.v_proj.weight] Loading weights: 71%|####### | 282/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.layer_norm1.bias] Loading weights: 71%|####### | 282/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.layer_norm1.bias] Loading weights: 71%|#######1 | 283/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.layer_norm1.weight] Loading weights: 71%|#######1 | 283/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.layer_norm1.weight] Loading weights: 71%|#######1 | 284/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.layer_norm2.bias] Loading weights: 71%|#######1 | 284/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.layer_norm2.bias] Loading weights: 72%|#######1 | 285/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.layer_norm2.weight] Loading weights: 72%|#######1 | 285/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.layer_norm2.weight] Loading weights: 72%|#######1 | 286/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.mlp.fc1.bias] Loading weights: 
72%|#######1 | 286/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.mlp.fc1.bias] Loading weights: 72%|#######2 | 287/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.mlp.fc1.weight] Loading weights: 72%|#######2 | 287/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.mlp.fc1.weight] Loading weights: 72%|#######2 | 288/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.mlp.fc2.bias] Loading weights: 72%|#######2 | 288/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.mlp.fc2.bias] Loading weights: 73%|#######2 | 289/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.mlp.fc2.weight] Loading weights: 73%|#######2 | 289/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.mlp.fc2.weight] Loading weights: 73%|#######2 | 290/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.k_proj.bias] Loading weights: 73%|#######2 | 290/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.k_proj.bias] Loading weights: 73%|#######3 | 291/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.k_proj.weight] Loading weights: 73%|#######3 | 291/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.k_proj.weight] Loading weights: 73%|#######3 | 292/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.out_proj.bias] Loading weights: 73%|#######3 | 292/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.out_proj.bias] Loading weights: 74%|#######3 | 293/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.out_proj.weight] Loading weights: 74%|#######3 | 293/398 [00:00<00:00, 2550.96it/s, Materializing 
param=vision_model.encoder.layers.5.self_attn.out_proj.weight] Loading weights: 74%|#######3 | 294/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.q_proj.bias] Loading weights: 74%|#######3 | 294/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.q_proj.bias] Loading weights: 74%|#######4 | 295/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.q_proj.weight] Loading weights: 74%|#######4 | 295/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.q_proj.weight] Loading weights: 74%|#######4 | 296/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.v_proj.bias] Loading weights: 74%|#######4 | 296/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.v_proj.bias] Loading weights: 75%|#######4 | 297/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.v_proj.weight] Loading weights: 75%|#######4 | 297/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.v_proj.weight] Loading weights: 75%|#######4 | 298/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.layer_norm1.bias] Loading weights: 75%|#######4 | 298/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.layer_norm1.bias] Loading weights: 75%|#######5 | 299/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.layer_norm1.weight] Loading weights: 75%|#######5 | 299/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.layer_norm1.weight] Loading weights: 75%|#######5 | 300/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.layer_norm2.bias] Loading weights: 75%|#######5 | 300/398 [00:00<00:00, 2550.96it/s, Materializing 
param=vision_model.encoder.layers.6.layer_norm2.bias] Loading weights: 76%|#######5 | 301/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.layer_norm2.weight] Loading weights: 76%|#######5 | 301/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.layer_norm2.weight] Loading weights: 76%|#######5 | 302/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.mlp.fc1.bias] Loading weights: 76%|#######5 | 302/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.mlp.fc1.bias] Loading weights: 76%|#######6 | 303/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.mlp.fc1.weight] Loading weights: 76%|#######6 | 303/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.mlp.fc1.weight] Loading weights: 76%|#######6 | 304/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.mlp.fc2.bias] Loading weights: 76%|#######6 | 304/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.mlp.fc2.bias] Loading weights: 77%|#######6 | 305/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.mlp.fc2.weight] Loading weights: 77%|#######6 | 305/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.mlp.fc2.weight] Loading weights: 77%|#######6 | 306/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.k_proj.bias] Loading weights: 77%|#######6 | 306/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.k_proj.bias] Loading weights: 77%|#######7 | 307/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.k_proj.weight] Loading weights: 77%|#######7 | 307/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.k_proj.weight] Loading weights: 77%|#######7 | 308/398 
[00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.out_proj.bias] Loading weights: 77%|#######7 | 308/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.out_proj.bias] Loading weights: 78%|#######7 | 309/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.out_proj.weight] Loading weights: 78%|#######7 | 309/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.out_proj.weight] Loading weights: 78%|#######7 | 310/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.q_proj.bias] Loading weights: 78%|#######7 | 310/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.q_proj.bias] Loading weights: 78%|#######8 | 311/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.q_proj.weight] Loading weights: 78%|#######8 | 311/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.q_proj.weight] Loading weights: 78%|#######8 | 312/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.v_proj.bias] Loading weights: 78%|#######8 | 312/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.v_proj.bias] Loading weights: 79%|#######8 | 313/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.v_proj.weight] Loading weights: 79%|#######8 | 313/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.v_proj.weight] Loading weights: 79%|#######8 | 314/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.layer_norm1.bias] Loading weights: 79%|#######8 | 314/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.layer_norm1.bias] Loading weights: 79%|#######9 | 315/398 [00:00<00:00, 2550.96it/s, 
Materializing param=vision_model.encoder.layers.7.layer_norm1.weight] Loading weights: 79%|#######9 | 315/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.layer_norm1.weight] Loading weights: 79%|#######9 | 316/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.layer_norm2.bias] Loading weights: 79%|#######9 | 316/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.layer_norm2.bias] Loading weights: 80%|#######9 | 317/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.layer_norm2.weight] Loading weights: 80%|#######9 | 317/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.layer_norm2.weight] Loading weights: 80%|#######9 | 318/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.mlp.fc1.bias] Loading weights: 80%|#######9 | 318/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.mlp.fc1.bias] Loading weights: 80%|######## | 319/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.mlp.fc1.weight] Loading weights: 80%|######## | 319/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.mlp.fc1.weight] Loading weights: 80%|######## | 320/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.mlp.fc2.bias] Loading weights: 80%|######## | 320/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.mlp.fc2.bias] Loading weights: 81%|######## | 321/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.mlp.fc2.weight] Loading weights: 81%|######## | 321/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.mlp.fc2.weight] Loading weights: 81%|######## | 322/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.k_proj.bias] Loading weights: 81%|######## | 322/398 
[00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.k_proj.bias] Loading weights: 81%|########1 | 323/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.k_proj.weight] Loading weights: 81%|########1 | 323/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.k_proj.weight] Loading weights: 81%|########1 | 324/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.out_proj.bias] Loading weights: 81%|########1 | 324/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.out_proj.bias] Loading weights: 82%|########1 | 325/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.out_proj.weight] Loading weights: 82%|########1 | 325/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.out_proj.weight] Loading weights: 82%|########1 | 326/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.q_proj.bias] Loading weights: 82%|########1 | 326/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.q_proj.bias] Loading weights: 82%|########2 | 327/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.q_proj.weight] Loading weights: 82%|########2 | 327/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.q_proj.weight] Loading weights: 82%|########2 | 328/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.v_proj.bias] Loading weights: 82%|########2 | 328/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.v_proj.bias] Loading weights: 83%|########2 | 329/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.v_proj.weight] Loading weights: 83%|########2 | 329/398 
[00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.v_proj.weight] Loading weights: 83%|########2 | 330/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.layer_norm1.bias] Loading weights: 83%|########2 | 330/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.layer_norm1.bias] Loading weights: 83%|########3 | 331/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.layer_norm1.weight] Loading weights: 83%|########3 | 331/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.layer_norm1.weight] Loading weights: 83%|########3 | 332/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.layer_norm2.bias] Loading weights: 83%|########3 | 332/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.layer_norm2.bias] Loading weights: 84%|########3 | 333/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.layer_norm2.weight] Loading weights: 84%|########3 | 333/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.layer_norm2.weight] Loading weights: 84%|########3 | 334/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.mlp.fc1.bias] Loading weights: 84%|########3 | 334/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.mlp.fc1.bias] Loading weights: 84%|########4 | 335/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.mlp.fc1.weight] Loading weights: 84%|########4 | 335/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.mlp.fc1.weight] Loading weights: 84%|########4 | 336/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.mlp.fc2.bias] Loading weights: 84%|########4 | 336/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.mlp.fc2.bias] 
Loading weights: 85%|########4 | 337/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.mlp.fc2.weight] Loading weights: 85%|########4 | 337/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.mlp.fc2.weight] Loading weights: 85%|########4 | 338/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.k_proj.bias] Loading weights: 85%|########4 | 338/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.k_proj.bias] Loading weights: 85%|########5 | 339/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.k_proj.weight] Loading weights: 85%|########5 | 339/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.k_proj.weight] Loading weights: 85%|########5 | 340/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.out_proj.bias] Loading weights: 85%|########5 | 340/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.out_proj.bias] Loading weights: 86%|########5 | 341/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.out_proj.weight] Loading weights: 86%|########5 | 341/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.out_proj.weight] Loading weights: 86%|########5 | 342/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.q_proj.bias] Loading weights: 86%|########5 | 342/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.q_proj.bias] Loading weights: 86%|########6 | 343/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.q_proj.weight] Loading weights: 86%|########6 | 343/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.q_proj.weight] Loading weights: 
86%|########6 | 344/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.v_proj.bias] Loading weights: 86%|########6 | 344/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.v_proj.bias] Loading weights: 87%|########6 | 345/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.v_proj.weight] Loading weights: 87%|########6 | 345/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.v_proj.weight] Loading weights: 87%|########6 | 346/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.layer_norm1.bias] Loading weights: 87%|########6 | 346/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.layer_norm1.bias] Loading weights: 87%|########7 | 347/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.layer_norm1.weight] Loading weights: 87%|########7 | 347/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.layer_norm1.weight] Loading weights: 87%|########7 | 348/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.layer_norm2.bias] Loading weights: 87%|########7 | 348/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.layer_norm2.bias] Loading weights: 88%|########7 | 349/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.layer_norm2.weight] Loading weights: 88%|########7 | 349/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.layer_norm2.weight] Loading weights: 88%|########7 | 350/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.mlp.fc1.bias] Loading weights: 88%|########7 | 350/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.mlp.fc1.bias] Loading weights: 88%|########8 | 351/398 [00:00<00:00, 2550.96it/s, Materializing 
param=vision_model.encoder.layers.9.mlp.fc1.weight] Loading weights: 88%|########8 | 351/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.mlp.fc1.weight] Loading weights: 88%|########8 | 352/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.mlp.fc2.bias] Loading weights: 88%|########8 | 352/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.mlp.fc2.bias] Loading weights: 89%|########8 | 353/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.mlp.fc2.weight] Loading weights: 89%|########8 | 353/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.mlp.fc2.weight] Loading weights: 89%|########8 | 354/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.k_proj.bias] Loading weights: 89%|########8 | 354/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.k_proj.bias] Loading weights: 89%|########9 | 355/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.k_proj.weight] Loading weights: 89%|########9 | 355/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.k_proj.weight] Loading weights: 89%|########9 | 356/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.out_proj.bias] Loading weights: 89%|########9 | 356/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.out_proj.bias] Loading weights: 90%|########9 | 357/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.out_proj.weight] Loading weights: 90%|########9 | 357/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.out_proj.weight] Loading weights: 90%|########9 | 358/398 [00:00<00:00, 2550.96it/s, Materializing 
param=vision_model.encoder.layers.9.self_attn.q_proj.bias] Loading weights: 90%|########9 | 358/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.q_proj.bias] Loading weights: 90%|######### | 359/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.q_proj.weight] Loading weights: 90%|######### | 359/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.q_proj.weight] Loading weights: 90%|######### | 360/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.v_proj.bias] Loading weights: 90%|######### | 360/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.v_proj.bias] Loading weights: 91%|######### | 361/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.v_proj.weight] Loading weights: 91%|######### | 361/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.v_proj.weight] Loading weights: 91%|######### | 362/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.layer_norm1.bias] Loading weights: 91%|######### | 362/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.layer_norm1.bias] Loading weights: 91%|#########1| 363/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.layer_norm1.weight] Loading weights: 91%|#########1| 363/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.layer_norm1.weight] Loading weights: 91%|#########1| 364/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.layer_norm2.bias] Loading weights: 91%|#########1| 364/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.layer_norm2.bias] Loading weights: 92%|#########1| 365/398 [00:00<00:00, 2550.96it/s, Materializing 
param=vision_model.encoder.layers.10.layer_norm2.weight]
Loading weights: 100%|##########| 398/398 [00:00<00:00, 2569.82it/s, Materializing param=visual_projection.weight]
CLIPModel LOAD REPORT from: openai/clip-vit-base-patch32
Key                                  | Status     |  |
-------------------------------------+------------+--+-
vision_model.embeddings.position_ids | UNEXPECTED |  |
text_model.embeddings.position_ids   | UNEXPECTED |  |
Notes:
- UNEXPECTED: can be ignored when loading from different task/architecture; not ok if you expect identical arch.
The image processor of type `CLIPImageProcessor` is now loaded as a fast processor by default, even if the model checkpoint was saved with a slow processor. This is a breaking change and may produce slightly different outputs. To continue using the slow processor, instantiate this class with `use_fast=False`.
INFO: Started server process [47308]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8001 (Press CTRL+C to quit)