File size: 111,834 Bytes
e666301

Loading weights:   0%|          | 0/398 [00:00<?, ?it/s]
Loading weights:   0%|          | 1/398 [00:00<00:00, 15420.24it/s, Materializing param=logit_scale]
Loading weights:   0%|          | 1/398 [00:00<00:00, 4245.25it/s, Materializing param=logit_scale] 
Loading weights:   1%|          | 2/398 [00:00<00:00, 4090.01it/s, Materializing param=text_model.embeddings.position_embedding.weight]
Loading weights:   1%|          | 2/398 [00:00<00:00, 3133.59it/s, Materializing param=text_model.embeddings.position_embedding.weight]
Loading weights:   1%|          | 3/398 [00:00<00:00, 3492.34it/s, Materializing param=text_model.embeddings.token_embedding.weight]   
Loading weights:   1%|          | 3/398 [00:00<00:00, 3044.50it/s, Materializing param=text_model.embeddings.token_embedding.weight]
Loading weights:   1%|1         | 4/398 [00:00<00:00, 3407.93it/s, Materializing param=text_model.encoder.layers.0.layer_norm1.bias]
Loading weights:   1%|1         | 4/398 [00:00<00:00, 3049.29it/s, Materializing param=text_model.encoder.layers.0.layer_norm1.bias]
Loading weights:   1%|1         | 5/398 [00:00<00:00, 3274.75it/s, Materializing param=text_model.encoder.layers.0.layer_norm1.weight]
Loading weights:   1%|1         | 5/398 [00:00<00:00, 2919.20it/s, Materializing param=text_model.encoder.layers.0.layer_norm1.weight]
Loading weights:   2%|1         | 6/398 [00:00<00:00, 3128.91it/s, Materializing param=text_model.encoder.layers.0.layer_norm2.bias]  
Loading weights:   2%|1         | 6/398 [00:00<00:00, 2933.76it/s, Materializing param=text_model.encoder.layers.0.layer_norm2.bias]
Loading weights:   2%|1         | 7/398 [00:00<00:00, 3167.90it/s, Materializing param=text_model.encoder.layers.0.layer_norm2.weight]
Loading weights:   2%|1         | 7/398 [00:00<00:00, 3009.75it/s, Materializing param=text_model.encoder.layers.0.layer_norm2.weight]
Loading weights:   2%|2         | 8/398 [00:00<00:00, 3096.28it/s, Materializing param=text_model.encoder.layers.0.mlp.fc1.bias]      
Loading weights:   2%|2         | 8/398 [00:00<00:00, 2838.30it/s, Materializing param=text_model.encoder.layers.0.mlp.fc1.bias]
Loading weights:   2%|2         | 9/398 [00:00<00:00, 2929.44it/s, Materializing param=text_model.encoder.layers.0.mlp.fc1.weight]
Loading weights:   2%|2         | 9/398 [00:00<00:00, 2814.97it/s, Materializing param=text_model.encoder.layers.0.mlp.fc1.weight]
Loading weights:   3%|2         | 10/398 [00:00<00:00, 2976.79it/s, Materializing param=text_model.encoder.layers.0.mlp.fc2.bias] 
Loading weights:   3%|2         | 10/398 [00:00<00:00, 2870.45it/s, Materializing param=text_model.encoder.layers.0.mlp.fc2.bias]
Loading weights:   3%|2         | 11/398 [00:00<00:00, 3015.32it/s, Materializing param=text_model.encoder.layers.0.mlp.fc2.weight]
Loading weights:   3%|2         | 11/398 [00:00<00:00, 2919.90it/s, Materializing param=text_model.encoder.layers.0.mlp.fc2.weight]
Loading weights:   3%|3         | 12/398 [00:00<00:00, 3059.49it/s, Materializing param=text_model.encoder.layers.0.self_attn.k_proj.bias]
Loading weights:   3%|3         | 12/398 [00:00<00:00, 2975.91it/s, Materializing param=text_model.encoder.layers.0.self_attn.k_proj.bias]
Loading weights:   3%|3         | 13/398 [00:00<00:00, 3015.65it/s, Materializing param=text_model.encoder.layers.0.self_attn.k_proj.weight]
Loading weights:   3%|3         | 13/398 [00:00<00:00, 2858.50it/s, Materializing param=text_model.encoder.layers.0.self_attn.k_proj.weight]
Loading weights:   4%|3         | 14/398 [00:00<00:00, 2855.63it/s, Materializing param=text_model.encoder.layers.0.self_attn.out_proj.bias]
Loading weights:   4%|3         | 14/398 [00:00<00:00, 2713.13it/s, Materializing param=text_model.encoder.layers.0.self_attn.out_proj.bias]
Loading weights:   4%|3         | 15/398 [00:00<00:00, 2702.05it/s, Materializing param=text_model.encoder.layers.0.self_attn.out_proj.weight]
Loading weights:   4%|3         | 15/398 [00:00<00:00, 2620.24it/s, Materializing param=text_model.encoder.layers.0.self_attn.out_proj.weight]
Loading weights:   4%|4         | 16/398 [00:00<00:00, 2687.36it/s, Materializing param=text_model.encoder.layers.0.self_attn.q_proj.bias]    
Loading weights:   4%|4         | 16/398 [00:00<00:00, 2505.84it/s, Materializing param=text_model.encoder.layers.0.self_attn.q_proj.bias]
Loading weights:   4%|4         | 17/398 [00:00<00:00, 2536.13it/s, Materializing param=text_model.encoder.layers.0.self_attn.q_proj.weight]
Loading weights:   4%|4         | 17/398 [00:00<00:00, 2449.52it/s, Materializing param=text_model.encoder.layers.0.self_attn.q_proj.weight]
Loading weights:   5%|4         | 18/398 [00:00<00:00, 2486.09it/s, Materializing param=text_model.encoder.layers.0.self_attn.v_proj.bias]  
Loading weights:   5%|4         | 18/398 [00:00<00:00, 2409.75it/s, Materializing param=text_model.encoder.layers.0.self_attn.v_proj.bias]
Loading weights:   5%|4         | 19/398 [00:00<00:00, 2461.67it/s, Materializing param=text_model.encoder.layers.0.self_attn.v_proj.weight]
Loading weights:   5%|4         | 19/398 [00:00<00:00, 2406.52it/s, Materializing param=text_model.encoder.layers.0.self_attn.v_proj.weight]
Loading weights:   5%|5         | 20/398 [00:00<00:00, 2466.29it/s, Materializing param=text_model.encoder.layers.1.layer_norm1.bias]       
Loading weights:   5%|5         | 20/398 [00:00<00:00, 2405.61it/s, Materializing param=text_model.encoder.layers.1.layer_norm1.bias]
Loading weights:   5%|5         | 21/398 [00:00<00:00, 2458.90it/s, Materializing param=text_model.encoder.layers.1.layer_norm1.weight]
Loading weights:   5%|5         | 21/398 [00:00<00:00, 2413.76it/s, Materializing param=text_model.encoder.layers.1.layer_norm1.weight]
Loading weights:   6%|5         | 22/398 [00:00<00:00, 2461.18it/s, Materializing param=text_model.encoder.layers.1.layer_norm2.bias]  
Loading weights:   6%|5         | 22/398 [00:00<00:00, 2398.55it/s, Materializing param=text_model.encoder.layers.1.layer_norm2.bias]
Loading weights:   6%|5         | 23/398 [00:00<00:00, 2448.95it/s, Materializing param=text_model.encoder.layers.1.layer_norm2.weight]
Loading weights:   6%|5         | 23/398 [00:00<00:00, 2399.49it/s, Materializing param=text_model.encoder.layers.1.layer_norm2.weight]
Loading weights:   6%|6         | 24/398 [00:00<00:00, 2454.42it/s, Materializing param=text_model.encoder.layers.1.mlp.fc1.bias]      
Loading weights:   6%|6         | 24/398 [00:00<00:00, 2319.11it/s, Materializing param=text_model.encoder.layers.1.mlp.fc1.bias]
Loading weights:   6%|6         | 25/398 [00:00<00:00, 2224.86it/s, Materializing param=text_model.encoder.layers.1.mlp.fc1.weight]
Loading weights:   6%|6         | 25/398 [00:00<00:00, 2122.41it/s, Materializing param=text_model.encoder.layers.1.mlp.fc1.weight]
Loading weights:   7%|6         | 26/398 [00:00<00:00, 2125.27it/s, Materializing param=text_model.encoder.layers.1.mlp.fc2.bias]  
Loading weights:   7%|6         | 26/398 [00:00<00:00, 2094.90it/s, Materializing param=text_model.encoder.layers.1.mlp.fc2.bias]
Loading weights:   7%|6         | 27/398 [00:00<00:00, 2121.07it/s, Materializing param=text_model.encoder.layers.1.mlp.fc2.weight]
Loading weights:   7%|6         | 27/398 [00:00<00:00, 2087.10it/s, Materializing param=text_model.encoder.layers.1.mlp.fc2.weight]
Loading weights:   7%|7         | 28/398 [00:00<00:00, 2126.08it/s, Materializing param=text_model.encoder.layers.1.self_attn.k_proj.bias]
Loading weights:   7%|7         | 28/398 [00:00<00:00, 2097.71it/s, Materializing param=text_model.encoder.layers.1.self_attn.k_proj.bias]
Loading weights:   7%|7         | 29/398 [00:00<00:00, 2141.80it/s, Materializing param=text_model.encoder.layers.1.self_attn.k_proj.weight]
Loading weights:   7%|7         | 29/398 [00:00<00:00, 2119.85it/s, Materializing param=text_model.encoder.layers.1.self_attn.k_proj.weight]
Loading weights:   8%|7         | 30/398 [00:00<00:00, 2154.79it/s, Materializing param=text_model.encoder.layers.1.self_attn.out_proj.bias]
Loading weights:   8%|7         | 30/398 [00:00<00:00, 2106.88it/s, Materializing param=text_model.encoder.layers.1.self_attn.out_proj.bias]
Loading weights:   8%|7         | 31/398 [00:00<00:00, 2149.18it/s, Materializing param=text_model.encoder.layers.1.self_attn.out_proj.weight]
Loading weights:   8%|7         | 31/398 [00:00<00:00, 2129.79it/s, Materializing param=text_model.encoder.layers.1.self_attn.out_proj.weight]
Loading weights:   8%|8         | 32/398 [00:00<00:00, 2173.50it/s, Materializing param=text_model.encoder.layers.1.self_attn.q_proj.bias]    
Loading weights:   8%|8         | 32/398 [00:00<00:00, 2156.87it/s, Materializing param=text_model.encoder.layers.1.self_attn.q_proj.bias]
Loading weights:   8%|8         | 33/398 [00:00<00:00, 2201.24it/s, Materializing param=text_model.encoder.layers.1.self_attn.q_proj.weight]
Loading weights:   8%|8         | 33/398 [00:00<00:00, 2184.33it/s, Materializing param=text_model.encoder.layers.1.self_attn.q_proj.weight]
Loading weights:   9%|8         | 34/398 [00:00<00:00, 2227.63it/s, Materializing param=text_model.encoder.layers.1.self_attn.v_proj.bias]  
Loading weights:   9%|8         | 34/398 [00:00<00:00, 2211.33it/s, Materializing param=text_model.encoder.layers.1.self_attn.v_proj.bias]
Loading weights:   9%|8         | 35/398 [00:00<00:00, 2251.85it/s, Materializing param=text_model.encoder.layers.1.self_attn.v_proj.weight]
Loading weights:   9%|8         | 35/398 [00:00<00:00, 2224.93it/s, Materializing param=text_model.encoder.layers.1.self_attn.v_proj.weight]
Loading weights:   9%|9         | 36/398 [00:00<00:00, 2253.62it/s, Materializing param=text_model.encoder.layers.2.layer_norm1.bias]       
Loading weights:   9%|9         | 36/398 [00:00<00:00, 2227.66it/s, Materializing param=text_model.encoder.layers.2.layer_norm1.bias]
Loading weights:   9%|9         | 37/398 [00:00<00:00, 2262.66it/s, Materializing param=text_model.encoder.layers.2.layer_norm1.weight]
Loading weights:   9%|9         | 37/398 [00:00<00:00, 2234.26it/s, Materializing param=text_model.encoder.layers.2.layer_norm1.weight]
Loading weights:  10%|9         | 38/398 [00:00<00:00, 2260.34it/s, Materializing param=text_model.encoder.layers.2.layer_norm2.bias]  
Loading weights:  10%|9         | 38/398 [00:00<00:00, 2233.98it/s, Materializing param=text_model.encoder.layers.2.layer_norm2.bias]
Loading weights:  10%|9         | 39/398 [00:00<00:00, 2254.26it/s, Materializing param=text_model.encoder.layers.2.layer_norm2.weight]
Loading weights:  10%|9         | 39/398 [00:00<00:00, 2234.27it/s, Materializing param=text_model.encoder.layers.2.layer_norm2.weight]
Loading weights:  10%|#         | 40/398 [00:00<00:00, 2267.19it/s, Materializing param=text_model.encoder.layers.2.mlp.fc1.bias]      
Loading weights:  10%|#         | 40/398 [00:00<00:00, 2251.04it/s, Materializing param=text_model.encoder.layers.2.mlp.fc1.bias]
Loading weights:  10%|#         | 41/398 [00:00<00:00, 2285.42it/s, Materializing param=text_model.encoder.layers.2.mlp.fc1.weight]
Loading weights:  10%|#         | 41/398 [00:00<00:00, 2264.12it/s, Materializing param=text_model.encoder.layers.2.mlp.fc1.weight]
Loading weights:  11%|#         | 42/398 [00:00<00:00, 2291.55it/s, Materializing param=text_model.encoder.layers.2.mlp.fc2.bias]  
Loading weights:  11%|#         | 42/398 [00:00<00:00, 2273.54it/s, Materializing param=text_model.encoder.layers.2.mlp.fc2.bias]
Loading weights:  11%|#         | 43/398 [00:00<00:00, 2297.90it/s, Materializing param=text_model.encoder.layers.2.mlp.fc2.weight]
Loading weights:  11%|#         | 43/398 [00:00<00:00, 2278.30it/s, Materializing param=text_model.encoder.layers.2.mlp.fc2.weight]
Loading weights:  11%|#1        | 44/398 [00:00<00:00, 2308.48it/s, Materializing param=text_model.encoder.layers.2.self_attn.k_proj.bias]
Loading weights:  11%|#1        | 44/398 [00:00<00:00, 2287.71it/s, Materializing param=text_model.encoder.layers.2.self_attn.k_proj.bias]
Loading weights:  11%|#1        | 45/398 [00:00<00:00, 2300.27it/s, Materializing param=text_model.encoder.layers.2.self_attn.k_proj.weight]
Loading weights:  11%|#1        | 45/398 [00:00<00:00, 2276.24it/s, Materializing param=text_model.encoder.layers.2.self_attn.k_proj.weight]
Loading weights:  12%|#1        | 46/398 [00:00<00:00, 2295.90it/s, Materializing param=text_model.encoder.layers.2.self_attn.out_proj.bias]
Loading weights:  12%|#1        | 46/398 [00:00<00:00, 2272.32it/s, Materializing param=text_model.encoder.layers.2.self_attn.out_proj.bias]
Loading weights:  12%|#1        | 47/398 [00:00<00:00, 2295.84it/s, Materializing param=text_model.encoder.layers.2.self_attn.out_proj.weight]
Loading weights:  12%|#1        | 47/398 [00:00<00:00, 2279.22it/s, Materializing param=text_model.encoder.layers.2.self_attn.out_proj.weight]
Loading weights:  12%|#2        | 48/398 [00:00<00:00, 2307.47it/s, Materializing param=text_model.encoder.layers.2.self_attn.q_proj.bias]    
Loading weights:  12%|#2        | 48/398 [00:00<00:00, 2288.19it/s, Materializing param=text_model.encoder.layers.2.self_attn.q_proj.bias]
Loading weights:  12%|#2        | 49/398 [00:00<00:00, 2308.60it/s, Materializing param=text_model.encoder.layers.2.self_attn.q_proj.weight]
Loading weights:  12%|#2        | 49/398 [00:00<00:00, 2291.23it/s, Materializing param=text_model.encoder.layers.2.self_attn.q_proj.weight]
Loading weights:  13%|#2        | 50/398 [00:00<00:00, 2318.58it/s, Materializing param=text_model.encoder.layers.2.self_attn.v_proj.bias]  
Loading weights:  13%|#2        | 50/398 [00:00<00:00, 2302.41it/s, Materializing param=text_model.encoder.layers.2.self_attn.v_proj.bias]
Loading weights:  13%|#2        | 51/398 [00:00<00:00, 2323.29it/s, Materializing param=text_model.encoder.layers.2.self_attn.v_proj.weight]
Loading weights:  13%|#2        | 51/398 [00:00<00:00, 2305.48it/s, Materializing param=text_model.encoder.layers.2.self_attn.v_proj.weight]
Loading weights:  13%|#3        | 52/398 [00:00<00:00, 2332.34it/s, Materializing param=text_model.encoder.layers.3.layer_norm1.bias]       
Loading weights:  13%|#3        | 52/398 [00:00<00:00, 2320.62it/s, Materializing param=text_model.encoder.layers.3.layer_norm1.bias]
Loading weights:  13%|#3        | 53/398 [00:00<00:00, 2349.03it/s, Materializing param=text_model.encoder.layers.3.layer_norm1.weight]
Loading weights:  13%|#3        | 53/398 [00:00<00:00, 2337.45it/s, Materializing param=text_model.encoder.layers.3.layer_norm1.weight]
Loading weights:  14%|#3        | 54/398 [00:00<00:00, 2366.07it/s, Materializing param=text_model.encoder.layers.3.layer_norm2.bias]  
Loading weights:  14%|#3        | 54/398 [00:00<00:00, 2348.68it/s, Materializing param=text_model.encoder.layers.3.layer_norm2.bias]
Loading weights:  14%|#3        | 55/398 [00:00<00:00, 2363.52it/s, Materializing param=text_model.encoder.layers.3.layer_norm2.weight]
Loading weights:  14%|#3        | 55/398 [00:00<00:00, 2346.31it/s, Materializing param=text_model.encoder.layers.3.layer_norm2.weight]
Loading weights:  14%|#4        | 56/398 [00:00<00:00, 2370.50it/s, Materializing param=text_model.encoder.layers.3.mlp.fc1.bias]      
Loading weights:  14%|#4        | 56/398 [00:00<00:00, 2358.62it/s, Materializing param=text_model.encoder.layers.3.mlp.fc1.bias]
Loading weights:  14%|#4        | 57/398 [00:00<00:00, 2382.75it/s, Materializing param=text_model.encoder.layers.3.mlp.fc1.weight]
Loading weights:  14%|#4        | 57/398 [00:00<00:00, 2371.61it/s, Materializing param=text_model.encoder.layers.3.mlp.fc1.weight]
Loading weights:  15%|#4        | 58/398 [00:00<00:00, 2394.65it/s, Materializing param=text_model.encoder.layers.3.mlp.fc2.bias]  
Loading weights:  15%|#4        | 58/398 [00:00<00:00, 2375.82it/s, Materializing param=text_model.encoder.layers.3.mlp.fc2.bias]
Loading weights:  15%|#4        | 59/398 [00:00<00:00, 2389.25it/s, Materializing param=text_model.encoder.layers.3.mlp.fc2.weight]
Loading weights:  15%|#4        | 59/398 [00:00<00:00, 2372.14it/s, Materializing param=text_model.encoder.layers.3.mlp.fc2.weight]
Loading weights:  15%|#5        | 60/398 [00:00<00:00, 2381.44it/s, Materializing param=text_model.encoder.layers.3.self_attn.k_proj.bias]
Loading weights:  15%|#5        | 60/398 [00:00<00:00, 2362.61it/s, Materializing param=text_model.encoder.layers.3.self_attn.k_proj.bias]
Loading weights:  15%|#5        | 61/398 [00:00<00:00, 2371.88it/s, Materializing param=text_model.encoder.layers.3.self_attn.k_proj.weight]
Loading weights:  15%|#5        | 61/398 [00:00<00:00, 2356.96it/s, Materializing param=text_model.encoder.layers.3.self_attn.k_proj.weight]
Loading weights:  16%|#5        | 62/398 [00:00<00:00, 2363.55it/s, Materializing param=text_model.encoder.layers.3.self_attn.out_proj.bias]
Loading weights:  16%|#5        | 62/398 [00:00<00:00, 2342.41it/s, Materializing param=text_model.encoder.layers.3.self_attn.out_proj.bias]
Loading weights:  16%|#5        | 63/398 [00:00<00:00, 2358.83it/s, Materializing param=text_model.encoder.layers.3.self_attn.out_proj.weight]
Loading weights:  16%|#5        | 63/398 [00:00<00:00, 2323.59it/s, Materializing param=text_model.encoder.layers.3.self_attn.out_proj.weight]
Loading weights:  16%|#6        | 64/398 [00:00<00:00, 2321.22it/s, Materializing param=text_model.encoder.layers.3.self_attn.q_proj.bias]    
Loading weights:  16%|#6        | 64/398 [00:00<00:00, 2302.47it/s, Materializing param=text_model.encoder.layers.3.self_attn.q_proj.bias]
Loading weights:  16%|#6        | 65/398 [00:00<00:00, 2309.36it/s, Materializing param=text_model.encoder.layers.3.self_attn.q_proj.weight]
Loading weights:  16%|#6        | 65/398 [00:00<00:00, 2291.04it/s, Materializing param=text_model.encoder.layers.3.self_attn.q_proj.weight]
Loading weights:  17%|#6        | 66/398 [00:00<00:00, 2302.26it/s, Materializing param=text_model.encoder.layers.3.self_attn.v_proj.bias]  
Loading weights:  17%|#6        | 66/398 [00:00<00:00, 2284.40it/s, Materializing param=text_model.encoder.layers.3.self_attn.v_proj.bias]
Loading weights:  17%|#6        | 67/398 [00:00<00:00, 2299.41it/s, Materializing param=text_model.encoder.layers.3.self_attn.v_proj.weight]
Loading weights:  17%|#6        | 67/398 [00:00<00:00, 2285.61it/s, Materializing param=text_model.encoder.layers.3.self_attn.v_proj.weight]
Loading weights:  17%|#7        | 68/398 [00:00<00:00, 2300.97it/s, Materializing param=text_model.encoder.layers.4.layer_norm1.bias]       
Loading weights:  17%|#7        | 68/398 [00:00<00:00, 2287.83it/s, Materializing param=text_model.encoder.layers.4.layer_norm1.bias]
Loading weights:  17%|#7        | 69/398 [00:00<00:00, 2304.21it/s, Materializing param=text_model.encoder.layers.4.layer_norm1.weight]
Loading weights:  17%|#7        | 69/398 [00:00<00:00, 2291.28it/s, Materializing param=text_model.encoder.layers.4.layer_norm1.weight]
Loading weights:  18%|#7        | 70/398 [00:00<00:00, 2306.74it/s, Materializing param=text_model.encoder.layers.4.layer_norm2.bias]  
Loading weights:  18%|#7        | 70/398 [00:00<00:00, 2293.38it/s, Materializing param=text_model.encoder.layers.4.layer_norm2.bias]
Loading weights:  18%|#7        | 71/398 [00:00<00:00, 2308.96it/s, Materializing param=text_model.encoder.layers.4.layer_norm2.weight]
Loading weights:  18%|#7        | 71/398 [00:00<00:00, 2297.38it/s, Materializing param=text_model.encoder.layers.4.layer_norm2.weight]
Loading weights:  18%|#8        | 72/398 [00:00<00:00, 2315.94it/s, Materializing param=text_model.encoder.layers.4.mlp.fc1.bias]      
Loading weights:  18%|#8        | 72/398 [00:00<00:00, 2304.53it/s, Materializing param=text_model.encoder.layers.4.mlp.fc1.bias]
Loading weights:  18%|#8        | 73/398 [00:00<00:00, 2322.53it/s, Materializing param=text_model.encoder.layers.4.mlp.fc1.weight]
Loading weights:  18%|#8        | 73/398 [00:00<00:00, 2312.99it/s, Materializing param=text_model.encoder.layers.4.mlp.fc1.weight]
Loading weights:  19%|#8        | 74/398 [00:00<00:00, 2330.38it/s, Materializing param=text_model.encoder.layers.4.mlp.fc2.bias]  
Loading weights:  19%|#8        | 74/398 [00:00<00:00, 2321.16it/s, Materializing param=text_model.encoder.layers.4.mlp.fc2.bias]
Loading weights:  19%|#8        | 75/398 [00:00<00:00, 2335.27it/s, Materializing param=text_model.encoder.layers.4.mlp.fc2.weight]
Loading weights:  19%|#8        | 75/398 [00:00<00:00, 2323.53it/s, Materializing param=text_model.encoder.layers.4.mlp.fc2.weight]
Loading weights:  19%|#9        | 76/398 [00:00<00:00, 2338.34it/s, Materializing param=text_model.encoder.layers.4.self_attn.k_proj.bias]
Loading weights:  19%|#9        | 76/398 [00:00<00:00, 2325.34it/s, Materializing param=text_model.encoder.layers.4.self_attn.k_proj.bias]
Loading weights:  19%|#9        | 77/398 [00:00<00:00, 2340.55it/s, Materializing param=text_model.encoder.layers.4.self_attn.k_proj.weight]
Loading weights:  19%|#9        | 77/398 [00:00<00:00, 2331.41it/s, Materializing param=text_model.encoder.layers.4.self_attn.k_proj.weight]
Loading weights:  20%|#9        | 78/398 [00:00<00:00, 2346.56it/s, Materializing param=text_model.encoder.layers.4.self_attn.out_proj.bias]
Loading weights:  20%|#9        | 78/398 [00:00<00:00, 2333.73it/s, Materializing param=text_model.encoder.layers.4.self_attn.out_proj.bias]
Loading weights:  20%|#9        | 79/398 [00:00<00:00, 2347.09it/s, Materializing param=text_model.encoder.layers.4.self_attn.out_proj.weight]
Loading weights:  20%|#9        | 79/398 [00:00<00:00, 2337.04it/s, Materializing param=text_model.encoder.layers.4.self_attn.out_proj.weight]
Loading weights:  20%|##        | 80/398 [00:00<00:00, 2353.64it/s, Materializing param=text_model.encoder.layers.4.self_attn.q_proj.bias]    
Loading weights:  20%|##        | 80/398 [00:00<00:00, 2345.33it/s, Materializing param=text_model.encoder.layers.4.self_attn.q_proj.bias]
Loading weights:  20%|##        | 81/398 [00:00<00:00, 2363.55it/s, Materializing param=text_model.encoder.layers.4.self_attn.q_proj.weight]
Loading weights:  20%|##        | 81/398 [00:00<00:00, 2350.51it/s, Materializing param=text_model.encoder.layers.4.self_attn.q_proj.weight]
Loading weights:  21%|##        | 82/398 [00:00<00:00, 2362.65it/s, Materializing param=text_model.encoder.layers.4.self_attn.v_proj.bias]  
Loading weights:  21%|##        | 82/398 [00:00<00:00, 2352.02it/s, Materializing param=text_model.encoder.layers.4.self_attn.v_proj.bias]
Loading weights:  21%|##        | 83/398 [00:00<00:00, 2367.15it/s, Materializing param=text_model.encoder.layers.4.self_attn.v_proj.weight]
Loading weights:  21%|##        | 83/398 [00:00<00:00, 2357.83it/s, Materializing param=text_model.encoder.layers.4.self_attn.v_proj.weight]
Loading weights:  21%|##1       | 84/398 [00:00<00:00, 2374.44it/s, Materializing param=text_model.encoder.layers.5.layer_norm1.bias]       
Loading weights:  21%|##1       | 84/398 [00:00<00:00, 2365.81it/s, Materializing param=text_model.encoder.layers.5.layer_norm1.bias]
Loading weights:  21%|##1       | 85/398 [00:00<00:00, 2383.14it/s, Materializing param=text_model.encoder.layers.5.layer_norm1.weight]
Loading weights:  21%|##1       | 85/398 [00:00<00:00, 2375.47it/s, Materializing param=text_model.encoder.layers.5.layer_norm1.weight]
Loading weights:  22%|##1       | 86/398 [00:00<00:00, 2393.17it/s, Materializing param=text_model.encoder.layers.5.layer_norm2.bias]  
Loading weights:  22%|##1       | 86/398 [00:00<00:00, 2385.84it/s, Materializing param=text_model.encoder.layers.5.layer_norm2.bias]
Loading weights:  22%|##1       | 87/398 [00:00<00:00, 2400.25it/s, Materializing param=text_model.encoder.layers.5.layer_norm2.weight]
Loading weights:  22%|##1       | 87/398 [00:00<00:00, 2389.10it/s, Materializing param=text_model.encoder.layers.5.layer_norm2.weight]
Loading weights:  22%|##2       | 88/398 [00:00<00:00, 2398.18it/s, Materializing param=text_model.encoder.layers.5.mlp.fc1.bias]      
Loading weights:  22%|##2       | 88/398 [00:00<00:00, 2384.61it/s, Materializing param=text_model.encoder.layers.5.mlp.fc1.bias]
Loading weights:  22%|##2       | 89/398 [00:00<00:00, 2393.27it/s, Materializing param=text_model.encoder.layers.5.mlp.fc1.weight]
Loading weights:  22%|##2       | 89/398 [00:00<00:00, 2383.31it/s, Materializing param=text_model.encoder.layers.5.mlp.fc1.weight]
Loading weights:  23%|##2       | 90/398 [00:00<00:00, 2398.60it/s, Materializing param=text_model.encoder.layers.5.mlp.fc2.bias]  
Loading weights:  23%|##2       | 90/398 [00:00<00:00, 2389.75it/s, Materializing param=text_model.encoder.layers.5.mlp.fc2.bias]
Loading weights:  23%|##2       | 91/398 [00:00<00:00, 2401.59it/s, Materializing param=text_model.encoder.layers.5.mlp.fc2.weight]
Loading weights:  23%|##2       | 91/398 [00:00<00:00, 2388.59it/s, Materializing param=text_model.encoder.layers.5.mlp.fc2.weight]
Loading weights:  23%|##3       | 92/398 [00:00<00:00, 2399.25it/s, Materializing param=text_model.encoder.layers.5.self_attn.k_proj.bias]
Loading weights:  23%|##3       | 92/398 [00:00<00:00, 2389.40it/s, Materializing param=text_model.encoder.layers.5.self_attn.k_proj.bias]
Loading weights:  23%|##3       | 93/398 [00:00<00:00, 2403.55it/s, Materializing param=text_model.encoder.layers.5.self_attn.k_proj.weight]
Loading weights:  23%|##3       | 93/398 [00:00<00:00, 2395.51it/s, Materializing param=text_model.encoder.layers.5.self_attn.k_proj.weight]
Loading weights:  24%|##3       | 94/398 [00:00<00:00, 2406.11it/s, Materializing param=text_model.encoder.layers.5.self_attn.out_proj.bias]
Loading weights:  24%|##3       | 94/398 [00:00<00:00, 2396.29it/s, Materializing param=text_model.encoder.layers.5.self_attn.out_proj.bias]
Loading weights:  24%|##3       | 95/398 [00:00<00:00, 2409.91it/s, Materializing param=text_model.encoder.layers.5.self_attn.out_proj.weight]
Loading weights:  24%|##3       | 95/398 [00:00<00:00, 2402.06it/s, Materializing param=text_model.encoder.layers.5.self_attn.out_proj.weight]
Loading weights:  24%|##4       | 96/398 [00:00<00:00, 2416.99it/s, Materializing param=text_model.encoder.layers.5.self_attn.q_proj.bias]    
Loading weights:  24%|##4       | 96/398 [00:00<00:00, 2408.13it/s, Materializing param=text_model.encoder.layers.5.self_attn.q_proj.bias]
Loading weights:  24%|##4       | 97/398 [00:00<00:00, 2422.24it/s, Materializing param=text_model.encoder.layers.5.self_attn.q_proj.weight]
Loading weights:  24%|##4       | 97/398 [00:00<00:00, 2414.94it/s, Materializing param=text_model.encoder.layers.5.self_attn.q_proj.weight]
Loading weights:  25%|##4       | 98/398 [00:00<00:00, 2430.22it/s, Materializing param=text_model.encoder.layers.5.self_attn.v_proj.bias]  
Loading weights:  25%|##4       | 98/398 [00:00<00:00, 2422.92it/s, Materializing param=text_model.encoder.layers.5.self_attn.v_proj.bias]
Loading weights:  25%|##4       | 99/398 [00:00<00:00, 2438.26it/s, Materializing param=text_model.encoder.layers.5.self_attn.v_proj.weight]
Loading weights:  25%|##4       | 99/398 [00:00<00:00, 2431.07it/s, Materializing param=text_model.encoder.layers.5.self_attn.v_proj.weight]
Loading weights:  25%|##5       | 100/398 [00:00<00:00, 2446.13it/s, Materializing param=text_model.encoder.layers.6.layer_norm1.bias]      
Loading weights:  25%|##5       | 100/398 [00:00<00:00, 2439.30it/s, Materializing param=text_model.encoder.layers.6.layer_norm1.bias]
Loading weights:  25%|##5       | 101/398 [00:00<00:00, 2451.66it/s, Materializing param=text_model.encoder.layers.6.layer_norm1.weight]
Loading weights:  25%|##5       | 101/398 [00:00<00:00, 2441.87it/s, Materializing param=text_model.encoder.layers.6.layer_norm1.weight]
Loading weights:  26%|##5       | 102/398 [00:00<00:00, 2452.06it/s, Materializing param=text_model.encoder.layers.6.layer_norm2.bias]  
Loading weights:  26%|##5       | 102/398 [00:00<00:00, 2442.66it/s, Materializing param=text_model.encoder.layers.6.layer_norm2.bias]
Loading weights:  26%|##5       | 103/398 [00:00<00:00, 2451.92it/s, Materializing param=text_model.encoder.layers.6.layer_norm2.weight]
Loading weights:  26%|##5       | 103/398 [00:00<00:00, 2442.42it/s, Materializing param=text_model.encoder.layers.6.layer_norm2.weight]
Loading weights:  26%|##6       | 104/398 [00:00<00:00, 2452.06it/s, Materializing param=text_model.encoder.layers.6.mlp.fc1.bias]      
Loading weights:  26%|##6       | 104/398 [00:00<00:00, 2443.19it/s, Materializing param=text_model.encoder.layers.6.mlp.fc1.bias]
Loading weights:  26%|##6       | 105/398 [00:00<00:00, 2455.11it/s, Materializing param=text_model.encoder.layers.6.mlp.fc1.weight]
Loading weights:  26%|##6       | 105/398 [00:00<00:00, 2448.17it/s, Materializing param=text_model.encoder.layers.6.mlp.fc1.weight]
Loading weights:  27%|##6       | 106/398 [00:00<00:00, 2461.34it/s, Materializing param=text_model.encoder.layers.6.mlp.fc2.bias]  
Loading weights:  27%|##6       | 106/398 [00:00<00:00, 2454.37it/s, Materializing param=text_model.encoder.layers.6.mlp.fc2.bias]
Loading weights:  27%|##6       | 107/398 [00:00<00:00, 2465.23it/s, Materializing param=text_model.encoder.layers.6.mlp.fc2.weight]
Loading weights:  27%|##6       | 107/398 [00:00<00:00, 2456.19it/s, Materializing param=text_model.encoder.layers.6.mlp.fc2.weight]
Loading weights:  27%|##7       | 108/398 [00:00<00:00, 2468.08it/s, Materializing param=text_model.encoder.layers.6.self_attn.k_proj.bias]
Loading weights:  27%|##7       | 108/398 [00:00<00:00, 2455.51it/s, Materializing param=text_model.encoder.layers.6.self_attn.k_proj.bias]
Loading weights:  27%|##7       | 109/398 [00:00<00:00, 2462.36it/s, Materializing param=text_model.encoder.layers.6.self_attn.k_proj.weight]
Loading weights:  27%|##7       | 109/398 [00:00<00:00, 2451.53it/s, Materializing param=text_model.encoder.layers.6.self_attn.k_proj.weight]
Loading weights:  28%|##7       | 110/398 [00:00<00:00, 2456.53it/s, Materializing param=text_model.encoder.layers.6.self_attn.out_proj.bias]
Loading weights:  28%|##7       | 110/398 [00:00<00:00, 2442.28it/s, Materializing param=text_model.encoder.layers.6.self_attn.out_proj.bias]
Loading weights:  28%|##7       | 111/398 [00:00<00:00, 2447.97it/s, Materializing param=text_model.encoder.layers.6.self_attn.out_proj.weight]
Loading weights:  28%|##7       | 111/398 [00:00<00:00, 2440.25it/s, Materializing param=text_model.encoder.layers.6.self_attn.out_proj.weight]
Loading weights:  28%|##8       | 112/398 [00:00<00:00, 2452.49it/s, Materializing param=text_model.encoder.layers.6.self_attn.q_proj.bias]    
Loading weights:  28%|##8       | 112/398 [00:00<00:00, 2441.82it/s, Materializing param=text_model.encoder.layers.6.self_attn.q_proj.bias]
Loading weights:  28%|##8       | 113/398 [00:00<00:00, 2449.80it/s, Materializing param=text_model.encoder.layers.6.self_attn.q_proj.weight]
Loading weights:  28%|##8       | 113/398 [00:00<00:00, 2438.06it/s, Materializing param=text_model.encoder.layers.6.self_attn.q_proj.weight]
Loading weights:  29%|##8       | 114/398 [00:00<00:00, 2443.48it/s, Materializing param=text_model.encoder.layers.6.self_attn.v_proj.bias]  
Loading weights:  29%|##8       | 114/398 [00:00<00:00, 2430.78it/s, Materializing param=text_model.encoder.layers.6.self_attn.v_proj.bias]
Loading weights:  29%|##8       | 115/398 [00:00<00:00, 2438.66it/s, Materializing param=text_model.encoder.layers.6.self_attn.v_proj.weight]
Loading weights:  29%|##8       | 115/398 [00:00<00:00, 2431.01it/s, Materializing param=text_model.encoder.layers.6.self_attn.v_proj.weight]
Loading weights:  29%|##9       | 116/398 [00:00<00:00, 2439.91it/s, Materializing param=text_model.encoder.layers.7.layer_norm1.bias]       
Loading weights:  29%|##9       | 116/398 [00:00<00:00, 2433.24it/s, Materializing param=text_model.encoder.layers.7.layer_norm1.bias]
Loading weights:  29%|##9       | 117/398 [00:00<00:00, 2442.44it/s, Materializing param=text_model.encoder.layers.7.layer_norm1.weight]
Loading weights:  29%|##9       | 117/398 [00:00<00:00, 2426.62it/s, Materializing param=text_model.encoder.layers.7.layer_norm1.weight]
Loading weights:  30%|##9       | 118/398 [00:00<00:00, 2433.88it/s, Materializing param=text_model.encoder.layers.7.layer_norm2.bias]  
Loading weights:  30%|##9       | 118/398 [00:00<00:00, 2425.62it/s, Materializing param=text_model.encoder.layers.7.layer_norm2.bias]
Loading weights:  30%|##9       | 119/398 [00:00<00:00, 2437.01it/s, Materializing param=text_model.encoder.layers.7.layer_norm2.weight]
Loading weights:  30%|##9       | 119/398 [00:00<00:00, 2430.90it/s, Materializing param=text_model.encoder.layers.7.layer_norm2.weight]
Loading weights:  30%|###       | 120/398 [00:00<00:00, 2443.21it/s, Materializing param=text_model.encoder.layers.7.mlp.fc1.bias]      
Loading weights:  30%|###       | 120/398 [00:00<00:00, 2434.87it/s, Materializing param=text_model.encoder.layers.7.mlp.fc1.bias]
Loading weights:  30%|###       | 121/398 [00:00<00:00, 2446.81it/s, Materializing param=text_model.encoder.layers.7.mlp.fc1.weight]
Loading weights:  30%|###       | 121/398 [00:00<00:00, 2441.11it/s, Materializing param=text_model.encoder.layers.7.mlp.fc1.weight]
Loading weights:  31%|###       | 122/398 [00:00<00:00, 2453.36it/s, Materializing param=text_model.encoder.layers.7.mlp.fc2.bias]  
Loading weights:  31%|###       | 122/398 [00:00<00:00, 2447.28it/s, Materializing param=text_model.encoder.layers.7.mlp.fc2.bias]
Loading weights:  31%|###       | 123/398 [00:00<00:00, 2459.29it/s, Materializing param=text_model.encoder.layers.7.mlp.fc2.weight]
Loading weights:  31%|###       | 123/398 [00:00<00:00, 2450.75it/s, Materializing param=text_model.encoder.layers.7.mlp.fc2.weight]
Loading weights:  31%|###1      | 124/398 [00:00<00:00, 2457.11it/s, Materializing param=text_model.encoder.layers.7.self_attn.k_proj.bias]
Loading weights:  31%|###1      | 124/398 [00:00<00:00, 2450.54it/s, Materializing param=text_model.encoder.layers.7.self_attn.k_proj.bias]
Loading weights:  31%|###1      | 125/398 [00:00<00:00, 2461.94it/s, Materializing param=text_model.encoder.layers.7.self_attn.k_proj.weight]
Loading weights:  31%|###1      | 125/398 [00:00<00:00, 2453.69it/s, Materializing param=text_model.encoder.layers.7.self_attn.k_proj.weight]
Loading weights:  32%|###1      | 126/398 [00:00<00:00, 2464.98it/s, Materializing param=text_model.encoder.layers.7.self_attn.out_proj.bias]
Loading weights:  32%|###1      | 126/398 [00:00<00:00, 2456.97it/s, Materializing param=text_model.encoder.layers.7.self_attn.out_proj.bias]
Loading weights:  32%|###1      | 127/398 [00:00<00:00, 2462.65it/s, Materializing param=text_model.encoder.layers.7.self_attn.out_proj.weight]
Loading weights:  32%|###1      | 127/398 [00:00<00:00, 2452.99it/s, Materializing param=text_model.encoder.layers.7.self_attn.out_proj.weight]
Loading weights:  32%|###2      | 128/398 [00:00<00:00, 2461.74it/s, Materializing param=text_model.encoder.layers.7.self_attn.q_proj.bias]    
Loading weights:  32%|###2      | 128/398 [00:00<00:00, 2454.01it/s, Materializing param=text_model.encoder.layers.7.self_attn.q_proj.bias]
Loading weights:  32%|###2      | 129/398 [00:00<00:00, 2465.29it/s, Materializing param=text_model.encoder.layers.7.self_attn.q_proj.weight]
Loading weights:  32%|###2      | 129/398 [00:00<00:00, 2456.68it/s, Materializing param=text_model.encoder.layers.7.self_attn.q_proj.weight]
Loading weights:  33%|###2      | 130/398 [00:00<00:00, 2461.39it/s, Materializing param=text_model.encoder.layers.7.self_attn.v_proj.bias]  
Loading weights:  33%|###2      | 130/398 [00:00<00:00, 2452.91it/s, Materializing param=text_model.encoder.layers.7.self_attn.v_proj.bias]
Loading weights:  33%|###2      | 131/398 [00:00<00:00, 2462.91it/s, Materializing param=text_model.encoder.layers.7.self_attn.v_proj.weight]
Loading weights:  33%|###2      | 131/398 [00:00<00:00, 2457.23it/s, Materializing param=text_model.encoder.layers.7.self_attn.v_proj.weight]
Loading weights:  33%|###3      | 132/398 [00:00<00:00, 2468.22it/s, Materializing param=text_model.encoder.layers.8.layer_norm1.bias]       
Loading weights:  33%|###3      | 132/398 [00:00<00:00, 2462.98it/s, Materializing param=text_model.encoder.layers.8.layer_norm1.bias]
Loading weights:  33%|###3      | 133/398 [00:00<00:00, 2474.39it/s, Materializing param=text_model.encoder.layers.8.layer_norm1.weight]
Loading weights:  33%|###3      | 133/398 [00:00<00:00, 2469.31it/s, Materializing param=text_model.encoder.layers.8.layer_norm1.weight]
Loading weights:  34%|###3      | 134/398 [00:00<00:00, 2481.16it/s, Materializing param=text_model.encoder.layers.8.layer_norm2.bias]  
Loading weights:  34%|###3      | 134/398 [00:00<00:00, 2475.82it/s, Materializing param=text_model.encoder.layers.8.layer_norm2.bias]
Loading weights:  34%|###3      | 135/398 [00:00<00:00, 2484.30it/s, Materializing param=text_model.encoder.layers.8.layer_norm2.weight]
Loading weights:  34%|###3      | 135/398 [00:00<00:00, 2471.09it/s, Materializing param=text_model.encoder.layers.8.layer_norm2.weight]
Loading weights:  34%|###4      | 136/398 [00:00<00:00, 2481.08it/s, Materializing param=text_model.encoder.layers.8.mlp.fc1.bias]      
Loading weights:  34%|###4      | 136/398 [00:00<00:00, 2475.62it/s, Materializing param=text_model.encoder.layers.8.mlp.fc1.bias]
Loading weights:  34%|###4      | 137/398 [00:00<00:00, 2485.87it/s, Materializing param=text_model.encoder.layers.8.mlp.fc1.weight]
Loading weights:  34%|###4      | 137/398 [00:00<00:00, 2480.75it/s, Materializing param=text_model.encoder.layers.8.mlp.fc1.weight]
Loading weights:  35%|###4      | 138/398 [00:00<00:00, 2489.45it/s, Materializing param=text_model.encoder.layers.8.mlp.fc2.bias]  
Loading weights:  35%|###4      | 138/398 [00:00<00:00, 2481.90it/s, Materializing param=text_model.encoder.layers.8.mlp.fc2.bias]
Loading weights:  35%|###4      | 139/398 [00:00<00:00, 2488.39it/s, Materializing param=text_model.encoder.layers.8.mlp.fc2.weight]
Loading weights:  35%|###4      | 139/398 [00:00<00:00, 2479.01it/s, Materializing param=text_model.encoder.layers.8.mlp.fc2.weight]
Loading weights:  35%|###5      | 140/398 [00:00<00:00, 2486.38it/s, Materializing param=text_model.encoder.layers.8.self_attn.k_proj.bias]
Loading weights:  35%|###5      | 140/398 [00:00<00:00, 2478.87it/s, Materializing param=text_model.encoder.layers.8.self_attn.k_proj.bias]
Loading weights:  35%|###5      | 141/398 [00:00<00:00, 2488.24it/s, Materializing param=text_model.encoder.layers.8.self_attn.k_proj.weight]
Loading weights:  35%|###5      | 141/398 [00:00<00:00, 2482.15it/s, Materializing param=text_model.encoder.layers.8.self_attn.k_proj.weight]
Loading weights:  36%|###5      | 142/398 [00:00<00:00, 2491.04it/s, Materializing param=text_model.encoder.layers.8.self_attn.out_proj.bias]
Loading weights:  36%|###5      | 142/398 [00:00<00:00, 2483.97it/s, Materializing param=text_model.encoder.layers.8.self_attn.out_proj.bias]
Loading weights:  36%|###5      | 143/398 [00:00<00:00, 2493.20it/s, Materializing param=text_model.encoder.layers.8.self_attn.out_proj.weight]
Loading weights:  36%|###5      | 143/398 [00:00<00:00, 2485.76it/s, Materializing param=text_model.encoder.layers.8.self_attn.out_proj.weight]
Loading weights:  36%|###6      | 144/398 [00:00<00:00, 2486.57it/s, Materializing param=text_model.encoder.layers.8.self_attn.q_proj.bias]    
Loading weights:  36%|###6      | 144/398 [00:00<00:00, 2477.83it/s, Materializing param=text_model.encoder.layers.8.self_attn.q_proj.bias]
Loading weights:  36%|###6      | 145/398 [00:00<00:00, 2484.16it/s, Materializing param=text_model.encoder.layers.8.self_attn.q_proj.weight]
Loading weights:  36%|###6      | 145/398 [00:00<00:00, 2476.32it/s, Materializing param=text_model.encoder.layers.8.self_attn.q_proj.weight]
Loading weights:  37%|###6      | 146/398 [00:00<00:00, 2484.89it/s, Materializing param=text_model.encoder.layers.8.self_attn.v_proj.bias]  
Loading weights:  37%|###6      | 146/398 [00:00<00:00, 2479.17it/s, Materializing param=text_model.encoder.layers.8.self_attn.v_proj.bias]
Loading weights:  37%|###6      | 147/398 [00:00<00:00, 2488.57it/s, Materializing param=text_model.encoder.layers.8.self_attn.v_proj.weight]
Loading weights:  37%|###6      | 147/398 [00:00<00:00, 2483.10it/s, Materializing param=text_model.encoder.layers.8.self_attn.v_proj.weight]
Loading weights:  37%|###7      | 148/398 [00:00<00:00, 2492.92it/s, Materializing param=text_model.encoder.layers.9.layer_norm1.bias]       
Loading weights:  37%|###7      | 148/398 [00:00<00:00, 2487.82it/s, Materializing param=text_model.encoder.layers.9.layer_norm1.bias]
Loading weights:  37%|###7      | 149/398 [00:00<00:00, 2497.35it/s, Materializing param=text_model.encoder.layers.9.layer_norm1.weight]
Loading weights:  37%|###7      | 149/398 [00:00<00:00, 2491.93it/s, Materializing param=text_model.encoder.layers.9.layer_norm1.weight]
Loading weights:  38%|###7      | 150/398 [00:00<00:00, 2499.04it/s, Materializing param=text_model.encoder.layers.9.layer_norm2.bias]  
Loading weights:  38%|###7      | 150/398 [00:00<00:00, 2493.66it/s, Materializing param=text_model.encoder.layers.9.layer_norm2.bias]
Loading weights:  38%|###7      | 151/398 [00:00<00:00, 2499.64it/s, Materializing param=text_model.encoder.layers.9.layer_norm2.weight]
Loading weights:  38%|###7      | 151/398 [00:00<00:00, 2489.90it/s, Materializing param=text_model.encoder.layers.9.layer_norm2.weight]
Loading weights:  38%|###8      | 152/398 [00:00<00:00, 2492.50it/s, Materializing param=text_model.encoder.layers.9.mlp.fc1.bias]      
Loading weights:  38%|###8      | 152/398 [00:00<00:00, 2482.47it/s, Materializing param=text_model.encoder.layers.9.mlp.fc1.bias]
Loading weights:  38%|###8      | 153/398 [00:00<00:00, 2485.98it/s, Materializing param=text_model.encoder.layers.9.mlp.fc1.weight]
Loading weights:  38%|###8      | 153/398 [00:00<00:00, 2479.15it/s, Materializing param=text_model.encoder.layers.9.mlp.fc1.weight]
Loading weights:  39%|###8      | 154/398 [00:00<00:00, 2485.26it/s, Materializing param=text_model.encoder.layers.9.mlp.fc2.bias]  
Loading weights:  39%|###8      | 154/398 [00:00<00:00, 2478.93it/s, Materializing param=text_model.encoder.layers.9.mlp.fc2.bias]
Loading weights:  39%|###8      | 155/398 [00:00<00:00, 2487.53it/s, Materializing param=text_model.encoder.layers.9.mlp.fc2.weight]
Loading weights:  39%|###8      | 155/398 [00:00<00:00, 2482.22it/s, Materializing param=text_model.encoder.layers.9.mlp.fc2.weight]
Loading weights:  39%|###9      | 156/398 [00:00<00:00, 2488.00it/s, Materializing param=text_model.encoder.layers.9.self_attn.k_proj.bias]
Loading weights:  39%|###9      | 156/398 [00:00<00:00, 2479.24it/s, Materializing param=text_model.encoder.layers.9.self_attn.k_proj.bias]
Loading weights:  39%|###9      | 157/398 [00:00<00:00, 2485.35it/s, Materializing param=text_model.encoder.layers.9.self_attn.k_proj.weight]
Loading weights:  39%|###9      | 157/398 [00:00<00:00, 2479.10it/s, Materializing param=text_model.encoder.layers.9.self_attn.k_proj.weight]
Loading weights:  40%|###9      | 158/398 [00:00<00:00, 2487.12it/s, Materializing param=text_model.encoder.layers.9.self_attn.out_proj.bias]
Loading weights:  40%|###9      | 158/398 [00:00<00:00, 2482.35it/s, Materializing param=text_model.encoder.layers.9.self_attn.out_proj.bias]
Loading weights:  40%|###9      | 159/398 [00:00<00:00, 2491.65it/s, Materializing param=text_model.encoder.layers.9.self_attn.out_proj.weight]
Loading weights:  40%|###9      | 159/398 [00:00<00:00, 2486.80it/s, Materializing param=text_model.encoder.layers.9.self_attn.out_proj.weight]
Loading weights:  40%|####      | 160/398 [00:00<00:00, 2495.45it/s, Materializing param=text_model.encoder.layers.9.self_attn.q_proj.bias]    
Loading weights:  40%|####      | 160/398 [00:00<00:00, 2490.62it/s, Materializing param=text_model.encoder.layers.9.self_attn.q_proj.bias]
Loading weights:  40%|####      | 161/398 [00:00<00:00, 2499.84it/s, Materializing param=text_model.encoder.layers.9.self_attn.q_proj.weight]
Loading weights:  40%|####      | 161/398 [00:00<00:00, 2493.75it/s, Materializing param=text_model.encoder.layers.9.self_attn.q_proj.weight]
Loading weights:  41%|####      | 162/398 [00:00<00:00, 2499.32it/s, Materializing param=text_model.encoder.layers.9.self_attn.v_proj.bias]  
Loading weights:  41%|####      | 162/398 [00:00<00:00, 2493.39it/s, Materializing param=text_model.encoder.layers.9.self_attn.v_proj.bias]
Loading weights:  41%|####      | 163/398 [00:00<00:00, 2501.40it/s, Materializing param=text_model.encoder.layers.9.self_attn.v_proj.weight]
Loading weights:  41%|####      | 163/398 [00:00<00:00, 2496.58it/s, Materializing param=text_model.encoder.layers.9.self_attn.v_proj.weight]
Loading weights:  41%|####1     | 164/398 [00:00<00:00, 2505.85it/s, Materializing param=text_model.encoder.layers.10.layer_norm1.bias]      
Loading weights:  41%|####1     | 164/398 [00:00<00:00, 2500.34it/s, Materializing param=text_model.encoder.layers.10.layer_norm1.bias]
Loading weights:  41%|####1     | 165/398 [00:00<00:00, 2501.18it/s, Materializing param=text_model.encoder.layers.10.layer_norm1.weight]
Loading weights:  41%|####1     | 165/398 [00:00<00:00, 2493.25it/s, Materializing param=text_model.encoder.layers.10.layer_norm1.weight]
Loading weights:  42%|####1     | 166/398 [00:00<00:00, 2497.50it/s, Materializing param=text_model.encoder.layers.10.layer_norm2.bias]  
Loading weights:  42%|####1     | 166/398 [00:00<00:00, 2490.84it/s, Materializing param=text_model.encoder.layers.10.layer_norm2.bias]
Loading weights:  42%|####1     | 167/398 [00:00<00:00, 2498.20it/s, Materializing param=text_model.encoder.layers.10.layer_norm2.weight]
Loading weights:  42%|####1     | 167/398 [00:00<00:00, 2493.44it/s, Materializing param=text_model.encoder.layers.10.layer_norm2.weight]
Loading weights:  42%|####2     | 168/398 [00:00<00:00, 2502.38it/s, Materializing param=text_model.encoder.layers.10.mlp.fc1.bias]      
Loading weights:  42%|####2     | 168/398 [00:00<00:00, 2497.94it/s, Materializing param=text_model.encoder.layers.10.mlp.fc1.bias]
Loading weights:  42%|####2     | 169/398 [00:00<00:00, 2502.89it/s, Materializing param=text_model.encoder.layers.10.mlp.fc1.weight]
Loading weights:  42%|####2     | 169/398 [00:00<00:00, 2493.71it/s, Materializing param=text_model.encoder.layers.10.mlp.fc1.weight]
Loading weights:  43%|####2     | 170/398 [00:00<00:00, 2500.16it/s, Materializing param=text_model.encoder.layers.10.mlp.fc2.bias]  
Loading weights:  43%|####2     | 170/398 [00:00<00:00, 2495.46it/s, Materializing param=text_model.encoder.layers.10.mlp.fc2.bias]
Loading weights:  43%|####2     | 171/398 [00:00<00:00, 2504.23it/s, Materializing param=text_model.encoder.layers.10.mlp.fc2.weight]
Loading weights:  43%|####2     | 171/398 [00:00<00:00, 2500.13it/s, Materializing param=text_model.encoder.layers.10.mlp.fc2.weight]
Loading weights:  43%|####3     | 172/398 [00:00<00:00, 2509.24it/s, Materializing param=text_model.encoder.layers.10.self_attn.k_proj.bias]
Loading weights:  43%|####3     | 172/398 [00:00<00:00, 2505.12it/s, Materializing param=text_model.encoder.layers.10.self_attn.k_proj.bias]
Loading weights:  43%|####3     | 173/398 [00:00<00:00, 2513.94it/s, Materializing param=text_model.encoder.layers.10.self_attn.k_proj.weight]
Loading weights:  43%|####3     | 173/398 [00:00<00:00, 2509.82it/s, Materializing param=text_model.encoder.layers.10.self_attn.k_proj.weight]
Loading weights:  44%|####3     | 174/398 [00:00<00:00, 2518.83it/s, Materializing param=text_model.encoder.layers.10.self_attn.out_proj.bias]
Loading weights:  44%|####3     | 174/398 [00:00<00:00, 2514.95it/s, Materializing param=text_model.encoder.layers.10.self_attn.out_proj.bias]
Loading weights:  44%|####3     | 175/398 [00:00<00:00, 2523.84it/s, Materializing param=text_model.encoder.layers.10.self_attn.out_proj.weight]
Loading weights:  44%|####3     | 175/398 [00:00<00:00, 2517.50it/s, Materializing param=text_model.encoder.layers.10.self_attn.out_proj.weight]
Loading weights:  44%|####4     | 176/398 [00:00<00:00, 2522.73it/s, Materializing param=text_model.encoder.layers.10.self_attn.q_proj.bias]    
Loading weights:  44%|####4     | 176/398 [00:00<00:00, 2516.29it/s, Materializing param=text_model.encoder.layers.10.self_attn.q_proj.bias]
Loading weights:  44%|####4     | 177/398 [00:00<00:00, 2523.68it/s, Materializing param=text_model.encoder.layers.10.self_attn.q_proj.weight]
Loading weights:  44%|####4     | 177/398 [00:00<00:00, 2518.90it/s, Materializing param=text_model.encoder.layers.10.self_attn.q_proj.weight]
Loading weights:  45%|####4     | 178/398 [00:00<00:00, 2527.28it/s, Materializing param=text_model.encoder.layers.10.self_attn.v_proj.bias]  
Loading weights:  45%|####4     | 178/398 [00:00<00:00, 2523.19it/s, Materializing param=text_model.encoder.layers.10.self_attn.v_proj.bias]
Loading weights:  45%|####4     | 179/398 [00:00<00:00, 2531.86it/s, Materializing param=text_model.encoder.layers.10.self_attn.v_proj.weight]
Loading weights:  45%|####4     | 179/398 [00:00<00:00, 2525.21it/s, Materializing param=text_model.encoder.layers.10.self_attn.v_proj.weight]
Loading weights:  45%|####5     | 180/398 [00:00<00:00, 2529.03it/s, Materializing param=text_model.encoder.layers.11.layer_norm1.bias]       
Loading weights:  45%|####5     | 180/398 [00:00<00:00, 2523.03it/s, Materializing param=text_model.encoder.layers.11.layer_norm1.bias]
Loading weights:  45%|####5     | 181/398 [00:00<00:00, 2529.83it/s, Materializing param=text_model.encoder.layers.11.layer_norm1.weight]
Loading weights:  45%|####5     | 181/398 [00:00<00:00, 2524.15it/s, Materializing param=text_model.encoder.layers.11.layer_norm1.weight]
Loading weights:  46%|####5     | 182/398 [00:00<00:00, 2531.56it/s, Materializing param=text_model.encoder.layers.11.layer_norm2.bias]  
Loading weights:  46%|####5     | 182/398 [00:00<00:00, 2526.54it/s, Materializing param=text_model.encoder.layers.11.layer_norm2.bias]
Loading weights:  46%|####5     | 183/398 [00:00<00:00, 2530.30it/s, Materializing param=text_model.encoder.layers.11.layer_norm2.weight]
Loading weights:  46%|####5     | 183/398 [00:00<00:00, 2524.74it/s, Materializing param=text_model.encoder.layers.11.layer_norm2.weight]
Loading weights:  46%|####6     | 184/398 [00:00<00:00, 2530.33it/s, Materializing param=text_model.encoder.layers.11.mlp.fc1.bias]      
Loading weights:  46%|####6     | 184/398 [00:00<00:00, 2524.23it/s, Materializing param=text_model.encoder.layers.11.mlp.fc1.bias]
Loading weights:  46%|####6     | 185/398 [00:00<00:00, 2531.21it/s, Materializing param=text_model.encoder.layers.11.mlp.fc1.weight]
Loading weights:  46%|####6     | 185/398 [00:00<00:00, 2526.93it/s, Materializing param=text_model.encoder.layers.11.mlp.fc1.weight]
Loading weights:  47%|####6     | 186/398 [00:00<00:00, 2534.94it/s, Materializing param=text_model.encoder.layers.11.mlp.fc2.bias]  
Loading weights:  47%|####6     | 186/398 [00:00<00:00, 2530.75it/s, Materializing param=text_model.encoder.layers.11.mlp.fc2.bias]
Loading weights:  47%|####6     | 187/398 [00:00<00:00, 2538.78it/s, Materializing param=text_model.encoder.layers.11.mlp.fc2.weight]
Loading weights:  47%|####6     | 187/398 [00:00<00:00, 2534.64it/s, Materializing param=text_model.encoder.layers.11.mlp.fc2.weight]
Loading weights:  47%|####7     | 188/398 [00:00<00:00, 2542.55it/s, Materializing param=text_model.encoder.layers.11.self_attn.k_proj.bias]
Loading weights:  47%|####7     | 188/398 [00:00<00:00, 2538.03it/s, Materializing param=text_model.encoder.layers.11.self_attn.k_proj.bias]
Loading weights:  47%|####7     | 189/398 [00:00<00:00, 2543.20it/s, Materializing param=text_model.encoder.layers.11.self_attn.k_proj.weight]
Loading weights:  47%|####7     | 189/398 [00:00<00:00, 2537.16it/s, Materializing param=text_model.encoder.layers.11.self_attn.k_proj.weight]
Loading weights:  48%|####7     | 190/398 [00:00<00:00, 2544.44it/s, Materializing param=text_model.encoder.layers.11.self_attn.out_proj.bias]
Loading weights:  48%|####7     | 190/398 [00:00<00:00, 2540.11it/s, Materializing param=text_model.encoder.layers.11.self_attn.out_proj.bias]
Loading weights:  48%|####7     | 191/398 [00:00<00:00, 2546.19it/s, Materializing param=text_model.encoder.layers.11.self_attn.out_proj.weight]
Loading weights:  48%|####7     | 191/398 [00:00<00:00, 2540.14it/s, Materializing param=text_model.encoder.layers.11.self_attn.out_proj.weight]
Loading weights:  48%|####8     | 192/398 [00:00<00:00, 2546.75it/s, Materializing param=text_model.encoder.layers.11.self_attn.q_proj.bias]    
Loading weights:  48%|####8     | 192/398 [00:00<00:00, 2542.51it/s, Materializing param=text_model.encoder.layers.11.self_attn.q_proj.bias]
Loading weights:  48%|####8     | 193/398 [00:00<00:00, 2546.00it/s, Materializing param=text_model.encoder.layers.11.self_attn.q_proj.weight]
Loading weights:  48%|####8     | 193/398 [00:00<00:00, 2540.44it/s, Materializing param=text_model.encoder.layers.11.self_attn.q_proj.weight]
Loading weights:  49%|####8     | 194/398 [00:00<00:00, 2546.74it/s, Materializing param=text_model.encoder.layers.11.self_attn.v_proj.bias]  
Loading weights:  49%|####8     | 194/398 [00:00<00:00, 2542.64it/s, Materializing param=text_model.encoder.layers.11.self_attn.v_proj.bias]
Loading weights:  49%|####8     | 195/398 [00:00<00:00, 2550.21it/s, Materializing param=text_model.encoder.layers.11.self_attn.v_proj.weight]
Loading weights:  49%|####8     | 195/398 [00:00<00:00, 2546.02it/s, Materializing param=text_model.encoder.layers.11.self_attn.v_proj.weight]
Loading weights:  49%|####9     | 196/398 [00:00<00:00, 2553.54it/s, Materializing param=text_model.final_layer_norm.bias]                    
Loading weights:  49%|####9     | 196/398 [00:00<00:00, 2547.52it/s, Materializing param=text_model.final_layer_norm.bias]
Loading weights:  49%|####9     | 197/398 [00:00<00:00, 2542.71it/s, Materializing param=text_model.final_layer_norm.weight]
Loading weights:  49%|####9     | 197/398 [00:00<00:00, 2534.33it/s, Materializing param=text_model.final_layer_norm.weight]
Loading weights:  50%|####9     | 198/398 [00:00<00:00, 2533.33it/s, Materializing param=text_projection.weight]            
Loading weights:  50%|####9     | 198/398 [00:00<00:00, 2525.07it/s, Materializing param=text_projection.weight]
Loading weights:  50%|#####     | 199/398 [00:00<00:00, 2526.25it/s, Materializing param=vision_model.embeddings.class_embedding]
Loading weights:  50%|#####     | 199/398 [00:00<00:00, 2520.28it/s, Materializing param=vision_model.embeddings.class_embedding]
Loading weights:  50%|#####     | 200/398 [00:00<00:00, 2526.31it/s, Materializing param=vision_model.embeddings.patch_embedding.weight]
Loading weights:  50%|#####     | 200/398 [00:00<00:00, 2522.02it/s, Materializing param=vision_model.embeddings.patch_embedding.weight]
Loading weights:  51%|#####     | 201/398 [00:00<00:00, 2528.86it/s, Materializing param=vision_model.embeddings.position_embedding.weight]
Loading weights:  51%|#####     | 201/398 [00:00<00:00, 2525.07it/s, Materializing param=vision_model.embeddings.position_embedding.weight]
Loading weights:  51%|#####     | 202/398 [00:00<00:00, 2532.24it/s, Materializing param=vision_model.encoder.layers.0.layer_norm1.bias]   
Loading weights:  51%|#####     | 202/398 [00:00<00:00, 2526.95it/s, Materializing param=vision_model.encoder.layers.0.layer_norm1.bias]
Loading weights:  51%|#####1    | 203/398 [00:00<00:00, 2531.72it/s, Materializing param=vision_model.encoder.layers.0.layer_norm1.weight]
Loading weights:  51%|#####1    | 203/398 [00:00<00:00, 2526.45it/s, Materializing param=vision_model.encoder.layers.0.layer_norm1.weight]
Loading weights:  51%|#####1    | 204/398 [00:00<00:00, 2532.67it/s, Materializing param=vision_model.encoder.layers.0.layer_norm2.bias]  
Loading weights:  51%|#####1    | 204/398 [00:00<00:00, 2528.50it/s, Materializing param=vision_model.encoder.layers.0.layer_norm2.bias]
Loading weights:  52%|#####1    | 205/398 [00:00<00:00, 2535.66it/s, Materializing param=vision_model.encoder.layers.0.layer_norm2.weight]
Loading weights:  52%|#####1    | 205/398 [00:00<00:00, 2529.44it/s, Materializing param=vision_model.encoder.layers.0.layer_norm2.weight]
Loading weights:  52%|#####1    | 206/398 [00:00<00:00, 2533.66it/s, Materializing param=vision_model.encoder.layers.0.mlp.fc1.bias]      
Loading weights:  52%|#####1    | 206/398 [00:00<00:00, 2527.81it/s, Materializing param=vision_model.encoder.layers.0.mlp.fc1.bias]
Loading weights:  52%|#####2    | 207/398 [00:00<00:00, 2533.01it/s, Materializing param=vision_model.encoder.layers.0.mlp.fc1.weight]
Loading weights:  52%|#####2    | 207/398 [00:00<00:00, 2528.01it/s, Materializing param=vision_model.encoder.layers.0.mlp.fc1.weight]
Loading weights:  52%|#####2    | 208/398 [00:00<00:00, 2534.28it/s, Materializing param=vision_model.encoder.layers.0.mlp.fc2.bias]  
Loading weights:  52%|#####2    | 208/398 [00:00<00:00, 2530.52it/s, Materializing param=vision_model.encoder.layers.0.mlp.fc2.bias]
Loading weights:  53%|#####2    | 209/398 [00:00<00:00, 2537.33it/s, Materializing param=vision_model.encoder.layers.0.mlp.fc2.weight]
Loading weights:  53%|#####2    | 209/398 [00:00<00:00, 2532.44it/s, Materializing param=vision_model.encoder.layers.0.mlp.fc2.weight]
Loading weights:  53%|#####2    | 210/398 [00:00<00:00, 2534.72it/s, Materializing param=vision_model.encoder.layers.0.self_attn.k_proj.bias]
Loading weights:  53%|#####2    | 210/398 [00:00<00:00, 2528.44it/s, Materializing param=vision_model.encoder.layers.0.self_attn.k_proj.bias]
Loading weights:  53%|#####3    | 211/398 [00:00<00:00, 2534.74it/s, Materializing param=vision_model.encoder.layers.0.self_attn.k_proj.weight]
Loading weights:  53%|#####3    | 211/398 [00:00<00:00, 2530.99it/s, Materializing param=vision_model.encoder.layers.0.self_attn.k_proj.weight]
Loading weights:  53%|#####3    | 212/398 [00:00<00:00, 2536.33it/s, Materializing param=vision_model.encoder.layers.0.self_attn.out_proj.bias]
Loading weights:  53%|#####3    | 212/398 [00:00<00:00, 2531.54it/s, Materializing param=vision_model.encoder.layers.0.self_attn.out_proj.bias]
Loading weights:  54%|#####3    | 213/398 [00:00<00:00, 2538.24it/s, Materializing param=vision_model.encoder.layers.0.self_attn.out_proj.weight]
Loading weights:  54%|#####3    | 213/398 [00:00<00:00, 2534.55it/s, Materializing param=vision_model.encoder.layers.0.self_attn.out_proj.weight]
Loading weights:  54%|#####3    | 214/398 [00:00<00:00, 2541.27it/s, Materializing param=vision_model.encoder.layers.0.self_attn.q_proj.bias]    
Loading weights:  54%|#####3    | 214/398 [00:00<00:00, 2537.72it/s, Materializing param=vision_model.encoder.layers.0.self_attn.q_proj.bias]
Loading weights:  54%|#####4    | 215/398 [00:00<00:00, 2543.44it/s, Materializing param=vision_model.encoder.layers.0.self_attn.q_proj.weight]
Loading weights:  54%|#####4    | 215/398 [00:00<00:00, 2538.64it/s, Materializing param=vision_model.encoder.layers.0.self_attn.q_proj.weight]
Loading weights:  54%|#####4    | 216/398 [00:00<00:00, 2544.48it/s, Materializing param=vision_model.encoder.layers.0.self_attn.v_proj.bias]  
Loading weights:  54%|#####4    | 216/398 [00:00<00:00, 2540.67it/s, Materializing param=vision_model.encoder.layers.0.self_attn.v_proj.bias]
Loading weights:  55%|#####4    | 217/398 [00:00<00:00, 2547.35it/s, Materializing param=vision_model.encoder.layers.0.self_attn.v_proj.weight]
Loading weights:  55%|#####4    | 217/398 [00:00<00:00, 2541.78it/s, Materializing param=vision_model.encoder.layers.0.self_attn.v_proj.weight]
Loading weights:  55%|#####4    | 218/398 [00:00<00:00, 2545.58it/s, Materializing param=vision_model.encoder.layers.1.layer_norm1.bias]       
Loading weights:  55%|#####4    | 218/398 [00:00<00:00, 2539.21it/s, Materializing param=vision_model.encoder.layers.1.layer_norm1.bias]
Loading weights:  55%|#####5    | 219/398 [00:00<00:00, 2543.40it/s, Materializing param=vision_model.encoder.layers.1.layer_norm1.weight]
Loading weights:  55%|#####5    | 219/398 [00:00<00:00, 2538.21it/s, Materializing param=vision_model.encoder.layers.1.layer_norm1.weight]
Loading weights:  55%|#####5    | 220/398 [00:00<00:00, 2543.99it/s, Materializing param=vision_model.encoder.layers.1.layer_norm2.bias]  
Loading weights:  55%|#####5    | 220/398 [00:00<00:00, 2539.36it/s, Materializing param=vision_model.encoder.layers.1.layer_norm2.bias]
Loading weights:  56%|#####5    | 221/398 [00:00<00:00, 2544.58it/s, Materializing param=vision_model.encoder.layers.1.layer_norm2.weight]
Loading weights:  56%|#####5    | 221/398 [00:00<00:00, 2539.50it/s, Materializing param=vision_model.encoder.layers.1.layer_norm2.weight]
Loading weights:  56%|#####5    | 222/398 [00:00<00:00, 2545.14it/s, Materializing param=vision_model.encoder.layers.1.mlp.fc1.bias]      
Loading weights:  56%|#####5    | 222/398 [00:00<00:00, 2541.30it/s, Materializing param=vision_model.encoder.layers.1.mlp.fc1.bias]
Loading weights:  56%|#####6    | 223/398 [00:00<00:00, 2547.70it/s, Materializing param=vision_model.encoder.layers.1.mlp.fc1.weight]
Loading weights:  56%|#####6    | 223/398 [00:00<00:00, 2544.19it/s, Materializing param=vision_model.encoder.layers.1.mlp.fc1.weight]
Loading weights:  56%|#####6    | 224/398 [00:00<00:00, 2550.58it/s, Materializing param=vision_model.encoder.layers.1.mlp.fc2.bias]  
Loading weights:  56%|#####6    | 224/398 [00:00<00:00, 2546.78it/s, Materializing param=vision_model.encoder.layers.1.mlp.fc2.bias]
Loading weights:  57%|#####6    | 225/398 [00:00<00:00, 2553.18it/s, Materializing param=vision_model.encoder.layers.1.mlp.fc2.weight]
Loading weights:  57%|#####6    | 225/398 [00:00<00:00, 2549.60it/s, Materializing param=vision_model.encoder.layers.1.mlp.fc2.weight]
Loading weights:  57%|#####6    | 226/398 [00:00<00:00, 2556.35it/s, Materializing param=vision_model.encoder.layers.1.self_attn.k_proj.bias]
Loading weights:  57%|#####6    | 226/398 [00:00<00:00, 2552.97it/s, Materializing param=vision_model.encoder.layers.1.self_attn.k_proj.bias]
Loading weights:  57%|#####7    | 227/398 [00:00<00:00, 2559.09it/s, Materializing param=vision_model.encoder.layers.1.self_attn.k_proj.weight]
Loading weights:  57%|#####7    | 227/398 [00:00<00:00, 2553.20it/s, Materializing param=vision_model.encoder.layers.1.self_attn.k_proj.weight]
Loading weights:  57%|#####7    | 228/398 [00:00<00:00, 2557.67it/s, Materializing param=vision_model.encoder.layers.1.self_attn.out_proj.bias]
Loading weights:  57%|#####7    | 228/398 [00:00<00:00, 2552.82it/s, Materializing param=vision_model.encoder.layers.1.self_attn.out_proj.bias]
Loading weights:  58%|#####7    | 229/398 [00:00<00:00, 2557.48it/s, Materializing param=vision_model.encoder.layers.1.self_attn.out_proj.weight]
Loading weights:  58%|#####7    | 229/398 [00:00<00:00, 2550.86it/s, Materializing param=vision_model.encoder.layers.1.self_attn.out_proj.weight]
Loading weights:  58%|#####7    | 230/398 [00:00<00:00, 2556.03it/s, Materializing param=vision_model.encoder.layers.1.self_attn.q_proj.bias]    
Loading weights:  58%|#####7    | 230/398 [00:00<00:00, 2551.15it/s, Materializing param=vision_model.encoder.layers.1.self_attn.q_proj.bias]
Loading weights:  58%|#####8    | 231/398 [00:00<00:00, 2555.36it/s, Materializing param=vision_model.encoder.layers.1.self_attn.q_proj.weight]
Loading weights:  58%|#####8    | 231/398 [00:00<00:00, 2550.60it/s, Materializing param=vision_model.encoder.layers.1.self_attn.q_proj.weight]
Loading weights:  58%|#####8    | 232/398 [00:00<00:00, 2556.19it/s, Materializing param=vision_model.encoder.layers.1.self_attn.v_proj.bias]  
Loading weights:  58%|#####8    | 232/398 [00:00<00:00, 2552.60it/s, Materializing param=vision_model.encoder.layers.1.self_attn.v_proj.bias]
Loading weights:  59%|#####8    | 233/398 [00:00<00:00, 2558.13it/s, Materializing param=vision_model.encoder.layers.1.self_attn.v_proj.weight]
Loading weights:  59%|#####8    | 233/398 [00:00<00:00, 2553.23it/s, Materializing param=vision_model.encoder.layers.1.self_attn.v_proj.weight]
Loading weights:  59%|#####8    | 234/398 [00:00<00:00, 2557.45it/s, Materializing param=vision_model.encoder.layers.2.layer_norm1.bias]       
Loading weights:  59%|#####8    | 234/398 [00:00<00:00, 2553.17it/s, Materializing param=vision_model.encoder.layers.2.layer_norm1.bias]
Loading weights:  59%|#####9    | 235/398 [00:00<00:00, 2559.07it/s, Materializing param=vision_model.encoder.layers.2.layer_norm1.weight]
Loading weights:  59%|#####9    | 235/398 [00:00<00:00, 2555.78it/s, Materializing param=vision_model.encoder.layers.2.layer_norm1.weight]
Loading weights:  59%|#####9    | 236/398 [00:00<00:00, 2562.26it/s, Materializing param=vision_model.encoder.layers.2.layer_norm2.bias]  
Loading weights:  59%|#####9    | 236/398 [00:00<00:00, 2559.20it/s, Materializing param=vision_model.encoder.layers.2.layer_norm2.bias]
Loading weights:  60%|#####9    | 237/398 [00:00<00:00, 2565.83it/s, Materializing param=vision_model.encoder.layers.2.layer_norm2.weight]
Loading weights:  60%|#####9    | 237/398 [00:00<00:00, 2562.55it/s, Materializing param=vision_model.encoder.layers.2.layer_norm2.weight]
Loading weights:  60%|#####9    | 238/398 [00:00<00:00, 2567.19it/s, Materializing param=vision_model.encoder.layers.2.mlp.fc1.bias]      
Loading weights:  60%|#####9    | 238/398 [00:00<00:00, 2561.43it/s, Materializing param=vision_model.encoder.layers.2.mlp.fc1.bias]
Loading weights:  60%|######    | 239/398 [00:00<00:00, 2566.60it/s, Materializing param=vision_model.encoder.layers.2.mlp.fc1.weight]
Loading weights:  60%|######    | 239/398 [00:00<00:00, 2562.61it/s, Materializing param=vision_model.encoder.layers.2.mlp.fc1.weight]
Loading weights:  60%|######    | 240/398 [00:00<00:00, 2565.97it/s, Materializing param=vision_model.encoder.layers.2.mlp.fc2.bias]  
Loading weights:  60%|######    | 240/398 [00:00<00:00, 2560.92it/s, Materializing param=vision_model.encoder.layers.2.mlp.fc2.bias]
Loading weights:  61%|######    | 241/398 [00:00<00:00, 2559.94it/s, Materializing param=vision_model.encoder.layers.2.mlp.fc2.weight]
Loading weights:  61%|######    | 241/398 [00:00<00:00, 2550.41it/s, Materializing param=vision_model.encoder.layers.2.mlp.fc2.weight]
Loading weights:  61%|######    | 242/398 [00:00<00:00, 2548.82it/s, Materializing param=vision_model.encoder.layers.2.self_attn.k_proj.bias]
Loading weights:  61%|######    | 242/398 [00:00<00:00, 2542.72it/s, Materializing param=vision_model.encoder.layers.2.self_attn.k_proj.bias]
Loading weights:  61%|######1   | 243/398 [00:00<00:00, 2545.43it/s, Materializing param=vision_model.encoder.layers.2.self_attn.k_proj.weight]
Loading weights:  61%|######1   | 243/398 [00:00<00:00, 2539.94it/s, Materializing param=vision_model.encoder.layers.2.self_attn.k_proj.weight]
Loading weights:  61%|######1   | 244/398 [00:00<00:00, 2544.16it/s, Materializing param=vision_model.encoder.layers.2.self_attn.out_proj.bias]
Loading weights:  61%|######1   | 244/398 [00:00<00:00, 2539.07it/s, Materializing param=vision_model.encoder.layers.2.self_attn.out_proj.bias]
Loading weights:  62%|######1   | 245/398 [00:00<00:00, 2542.64it/s, Materializing param=vision_model.encoder.layers.2.self_attn.out_proj.weight]
Loading weights:  62%|######1   | 245/398 [00:00<00:00, 2535.59it/s, Materializing param=vision_model.encoder.layers.2.self_attn.out_proj.weight]
Loading weights:  62%|######1   | 246/398 [00:00<00:00, 2539.39it/s, Materializing param=vision_model.encoder.layers.2.self_attn.q_proj.bias]    
Loading weights:  62%|######1   | 246/398 [00:00<00:00, 2534.24it/s, Materializing param=vision_model.encoder.layers.2.self_attn.q_proj.bias]
Loading weights:  62%|######2   | 247/398 [00:00<00:00, 2538.63it/s, Materializing param=vision_model.encoder.layers.2.self_attn.q_proj.weight]
Loading weights:  62%|######2   | 247/398 [00:00<00:00, 2534.86it/s, Materializing param=vision_model.encoder.layers.2.self_attn.q_proj.weight]
Loading weights:  62%|######2   | 248/398 [00:00<00:00, 2539.61it/s, Materializing param=vision_model.encoder.layers.2.self_attn.v_proj.bias]  
Loading weights:  62%|######2   | 248/398 [00:00<00:00, 2536.13it/s, Materializing param=vision_model.encoder.layers.2.self_attn.v_proj.bias]
Loading weights:  63%|######2   | 249/398 [00:00<00:00, 2541.64it/s, Materializing param=vision_model.encoder.layers.2.self_attn.v_proj.weight]
Loading weights:  63%|######2   | 249/398 [00:00<00:00, 2537.33it/s, Materializing param=vision_model.encoder.layers.2.self_attn.v_proj.weight]
Loading weights:  63%|######2   | 250/398 [00:00<00:00, 2542.04it/s, Materializing param=vision_model.encoder.layers.3.layer_norm1.bias]       
Loading weights:  63%|######2   | 250/398 [00:00<00:00, 2538.46it/s, Materializing param=vision_model.encoder.layers.3.layer_norm1.bias]
Loading weights:  63%|######3   | 251/398 [00:00<00:00, 2543.81it/s, Materializing param=vision_model.encoder.layers.3.layer_norm1.weight]
Loading weights:  63%|######3   | 251/398 [00:00<00:00, 2540.24it/s, Materializing param=vision_model.encoder.layers.3.layer_norm1.weight]
Loading weights:  63%|######3   | 252/398 [00:00<00:00, 2545.57it/s, Materializing param=vision_model.encoder.layers.3.layer_norm2.bias]  
Loading weights:  63%|######3   | 252/398 [00:00<00:00, 2542.43it/s, Materializing param=vision_model.encoder.layers.3.layer_norm2.bias]
Loading weights:  64%|######3   | 253/398 [00:00<00:00, 2548.15it/s, Materializing param=vision_model.encoder.layers.3.layer_norm2.weight]
Loading weights:  64%|######3   | 253/398 [00:00<00:00, 2545.08it/s, Materializing param=vision_model.encoder.layers.3.layer_norm2.weight]
Loading weights:  64%|######3   | 254/398 [00:00<00:00, 2550.82it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc1.bias]      
Loading weights:  64%|######3   | 254/398 [00:00<00:00, 2547.73it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc1.bias]
Loading weights:  64%|######4   | 255/398 [00:00<00:00, 2551.72it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc1.weight]
Loading weights:  64%|######4   | 255/398 [00:00<00:00, 2546.88it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc1.weight]
Loading weights:  64%|######4   | 256/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc1.weight]
Loading weights:  64%|######4   | 256/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc2.bias]  
Loading weights:  64%|######4   | 256/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc2.bias]
Loading weights:  65%|######4   | 257/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc2.weight]
Loading weights:  65%|######4   | 257/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.mlp.fc2.weight]
Loading weights:  65%|######4   | 258/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.k_proj.bias]
Loading weights:  65%|######4   | 258/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.k_proj.bias]
Loading weights:  65%|######5   | 259/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.k_proj.weight]
Loading weights:  65%|######5   | 259/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.k_proj.weight]
Loading weights:  65%|######5   | 260/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.out_proj.bias]
Loading weights:  65%|######5   | 260/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.out_proj.bias]
Loading weights:  66%|######5   | 261/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.out_proj.weight]
Loading weights:  66%|######5   | 261/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.out_proj.weight]
Loading weights:  66%|######5   | 262/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.q_proj.bias]    
Loading weights:  66%|######5   | 262/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.q_proj.bias]
Loading weights:  66%|######6   | 263/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.q_proj.weight]
Loading weights:  66%|######6   | 263/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.q_proj.weight]
Loading weights:  66%|######6   | 264/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.v_proj.bias]  
Loading weights:  66%|######6   | 264/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.v_proj.bias]
Loading weights:  67%|######6   | 265/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.v_proj.weight]
Loading weights:  67%|######6   | 265/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.3.self_attn.v_proj.weight]
Loading weights:  67%|######6   | 266/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.layer_norm1.bias]       
Loading weights:  67%|######6   | 266/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.layer_norm1.bias]
Loading weights:  67%|######7   | 267/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.layer_norm1.weight]
Loading weights:  67%|######7   | 267/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.layer_norm1.weight]
Loading weights:  67%|######7   | 268/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.layer_norm2.bias]  
Loading weights:  67%|######7   | 268/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.layer_norm2.bias]
Loading weights:  68%|######7   | 269/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.layer_norm2.weight]
Loading weights:  68%|######7   | 269/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.layer_norm2.weight]
Loading weights:  68%|######7   | 270/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.mlp.fc1.bias]      
Loading weights:  68%|######7   | 270/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.mlp.fc1.bias]
Loading weights:  68%|######8   | 271/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.mlp.fc1.weight]
Loading weights:  68%|######8   | 271/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.mlp.fc1.weight]
Loading weights:  68%|######8   | 272/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.mlp.fc2.bias]  
Loading weights:  68%|######8   | 272/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.mlp.fc2.bias]
Loading weights:  69%|######8   | 273/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.mlp.fc2.weight]
Loading weights:  69%|######8   | 273/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.mlp.fc2.weight]
Loading weights:  69%|######8   | 274/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.k_proj.bias]
Loading weights:  69%|######8   | 274/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.k_proj.bias]
Loading weights:  69%|######9   | 275/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.k_proj.weight]
Loading weights:  69%|######9   | 275/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.k_proj.weight]
Loading weights:  69%|######9   | 276/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.out_proj.bias]
Loading weights:  69%|######9   | 276/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.out_proj.bias]
Loading weights:  70%|######9   | 277/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.out_proj.weight]
Loading weights:  70%|######9   | 277/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.out_proj.weight]
Loading weights:  70%|######9   | 278/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.q_proj.bias]    
Loading weights:  70%|######9   | 278/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.q_proj.bias]
Loading weights:  70%|#######   | 279/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.q_proj.weight]
Loading weights:  70%|#######   | 279/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.q_proj.weight]
Loading weights:  70%|#######   | 280/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.v_proj.bias]  
Loading weights:  70%|#######   | 280/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.v_proj.bias]
Loading weights:  71%|#######   | 281/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.v_proj.weight]
Loading weights:  71%|#######   | 281/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.4.self_attn.v_proj.weight]
Loading weights:  71%|#######   | 282/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.layer_norm1.bias]       
Loading weights:  71%|#######   | 282/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.layer_norm1.bias]
Loading weights:  71%|#######1  | 283/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.layer_norm1.weight]
Loading weights:  71%|#######1  | 283/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.layer_norm1.weight]
Loading weights:  71%|#######1  | 284/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.layer_norm2.bias]  
Loading weights:  71%|#######1  | 284/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.layer_norm2.bias]
Loading weights:  72%|#######1  | 285/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.layer_norm2.weight]
Loading weights:  72%|#######1  | 285/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.layer_norm2.weight]
Loading weights:  72%|#######1  | 286/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.mlp.fc1.bias]      
Loading weights:  72%|#######1  | 286/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.mlp.fc1.bias]
Loading weights:  72%|#######2  | 287/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.mlp.fc1.weight]
Loading weights:  72%|#######2  | 287/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.mlp.fc1.weight]
Loading weights:  72%|#######2  | 288/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.mlp.fc2.bias]  
Loading weights:  72%|#######2  | 288/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.mlp.fc2.bias]
Loading weights:  73%|#######2  | 289/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.mlp.fc2.weight]
Loading weights:  73%|#######2  | 289/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.mlp.fc2.weight]
Loading weights:  73%|#######2  | 290/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.k_proj.bias]
Loading weights:  73%|#######2  | 290/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.k_proj.bias]
Loading weights:  73%|#######3  | 291/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.k_proj.weight]
Loading weights:  73%|#######3  | 291/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.k_proj.weight]
Loading weights:  73%|#######3  | 292/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.out_proj.bias]
Loading weights:  73%|#######3  | 292/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.out_proj.bias]
Loading weights:  74%|#######3  | 293/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.out_proj.weight]
Loading weights:  74%|#######3  | 293/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.out_proj.weight]
Loading weights:  74%|#######3  | 294/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.q_proj.bias]    
Loading weights:  74%|#######3  | 294/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.q_proj.bias]
Loading weights:  74%|#######4  | 295/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.q_proj.weight]
Loading weights:  74%|#######4  | 295/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.q_proj.weight]
Loading weights:  74%|#######4  | 296/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.v_proj.bias]  
Loading weights:  74%|#######4  | 296/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.v_proj.bias]
Loading weights:  75%|#######4  | 297/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.v_proj.weight]
Loading weights:  75%|#######4  | 297/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.5.self_attn.v_proj.weight]
Loading weights:  75%|#######4  | 298/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.layer_norm1.bias]       
Loading weights:  75%|#######4  | 298/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.layer_norm1.bias]
Loading weights:  75%|#######5  | 299/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.layer_norm1.weight]
Loading weights:  75%|#######5  | 299/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.layer_norm1.weight]
Loading weights:  75%|#######5  | 300/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.layer_norm2.bias]  
Loading weights:  75%|#######5  | 300/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.layer_norm2.bias]
Loading weights:  76%|#######5  | 301/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.layer_norm2.weight]
Loading weights:  76%|#######5  | 301/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.layer_norm2.weight]
Loading weights:  76%|#######5  | 302/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.mlp.fc1.bias]      
Loading weights:  76%|#######5  | 302/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.mlp.fc1.bias]
Loading weights:  76%|#######6  | 303/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.mlp.fc1.weight]
Loading weights:  76%|#######6  | 303/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.mlp.fc1.weight]
Loading weights:  76%|#######6  | 304/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.mlp.fc2.bias]  
Loading weights:  76%|#######6  | 304/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.mlp.fc2.bias]
Loading weights:  77%|#######6  | 305/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.mlp.fc2.weight]
Loading weights:  77%|#######6  | 305/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.mlp.fc2.weight]
Loading weights:  77%|#######6  | 306/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.k_proj.bias]
Loading weights:  77%|#######6  | 306/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.k_proj.bias]
Loading weights:  77%|#######7  | 307/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.k_proj.weight]
Loading weights:  77%|#######7  | 307/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.k_proj.weight]
Loading weights:  77%|#######7  | 308/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.out_proj.bias]
Loading weights:  77%|#######7  | 308/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.out_proj.bias]
Loading weights:  78%|#######7  | 309/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.out_proj.weight]
Loading weights:  78%|#######7  | 309/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.out_proj.weight]
Loading weights:  78%|#######7  | 310/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.q_proj.bias]    
Loading weights:  78%|#######7  | 310/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.q_proj.bias]
Loading weights:  78%|#######8  | 311/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.q_proj.weight]
Loading weights:  78%|#######8  | 311/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.q_proj.weight]
Loading weights:  78%|#######8  | 312/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.v_proj.bias]  
Loading weights:  78%|#######8  | 312/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.v_proj.bias]
Loading weights:  79%|#######8  | 313/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.v_proj.weight]
Loading weights:  79%|#######8  | 313/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.6.self_attn.v_proj.weight]
Loading weights:  79%|#######8  | 314/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.layer_norm1.bias]       
Loading weights:  79%|#######8  | 314/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.layer_norm1.bias]
Loading weights:  79%|#######9  | 315/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.layer_norm1.weight]
Loading weights:  79%|#######9  | 315/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.layer_norm1.weight]
Loading weights:  79%|#######9  | 316/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.layer_norm2.bias]  
Loading weights:  79%|#######9  | 316/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.layer_norm2.bias]
Loading weights:  80%|#######9  | 317/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.layer_norm2.weight]
Loading weights:  80%|#######9  | 317/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.layer_norm2.weight]
Loading weights:  80%|#######9  | 318/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.mlp.fc1.bias]      
Loading weights:  80%|#######9  | 318/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.mlp.fc1.bias]
Loading weights:  80%|########  | 319/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.mlp.fc1.weight]
Loading weights:  80%|########  | 319/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.mlp.fc1.weight]
Loading weights:  80%|########  | 320/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.mlp.fc2.bias]  
Loading weights:  80%|########  | 320/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.mlp.fc2.bias]
Loading weights:  81%|########  | 321/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.mlp.fc2.weight]
Loading weights:  81%|########  | 321/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.mlp.fc2.weight]
Loading weights:  81%|########  | 322/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.k_proj.bias]
Loading weights:  81%|########  | 322/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.k_proj.bias]
Loading weights:  81%|########1 | 323/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.k_proj.weight]
Loading weights:  81%|########1 | 323/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.k_proj.weight]
Loading weights:  81%|########1 | 324/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.out_proj.bias]
Loading weights:  81%|########1 | 324/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.out_proj.bias]
Loading weights:  82%|########1 | 325/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.out_proj.weight]
Loading weights:  82%|########1 | 325/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.out_proj.weight]
Loading weights:  82%|########1 | 326/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.q_proj.bias]    
Loading weights:  82%|########1 | 326/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.q_proj.bias]
Loading weights:  82%|########2 | 327/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.q_proj.weight]
Loading weights:  82%|########2 | 327/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.q_proj.weight]
Loading weights:  82%|########2 | 328/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.v_proj.bias]  
Loading weights:  82%|########2 | 328/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.v_proj.bias]
Loading weights:  83%|########2 | 329/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.v_proj.weight]
Loading weights:  83%|########2 | 329/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.7.self_attn.v_proj.weight]
Loading weights:  83%|########2 | 330/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.layer_norm1.bias]       
Loading weights:  83%|########2 | 330/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.layer_norm1.bias]
Loading weights:  83%|########3 | 331/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.layer_norm1.weight]
Loading weights:  83%|########3 | 331/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.layer_norm1.weight]
Loading weights:  83%|########3 | 332/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.layer_norm2.bias]  
Loading weights:  83%|########3 | 332/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.layer_norm2.bias]
Loading weights:  84%|########3 | 333/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.layer_norm2.weight]
Loading weights:  84%|########3 | 333/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.layer_norm2.weight]
Loading weights:  84%|########3 | 334/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.mlp.fc1.bias]      
Loading weights:  84%|########3 | 334/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.mlp.fc1.bias]
Loading weights:  84%|########4 | 335/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.mlp.fc1.weight]
Loading weights:  84%|########4 | 335/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.mlp.fc1.weight]
Loading weights:  84%|########4 | 336/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.mlp.fc2.bias]  
Loading weights:  84%|########4 | 336/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.mlp.fc2.bias]
Loading weights:  85%|########4 | 337/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.mlp.fc2.weight]
Loading weights:  85%|########4 | 337/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.mlp.fc2.weight]
Loading weights:  85%|########4 | 338/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.k_proj.bias]
Loading weights:  85%|########4 | 338/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.k_proj.bias]
Loading weights:  85%|########5 | 339/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.k_proj.weight]
Loading weights:  85%|########5 | 339/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.k_proj.weight]
Loading weights:  85%|########5 | 340/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.out_proj.bias]
Loading weights:  86%|########5 | 341/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.out_proj.weight]
Loading weights:  86%|########5 | 342/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.q_proj.bias]
Loading weights:  86%|########6 | 343/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.q_proj.weight]
Loading weights:  86%|########6 | 344/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.v_proj.bias]
Loading weights:  87%|########6 | 345/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.8.self_attn.v_proj.weight]
Loading weights:  87%|########6 | 346/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.layer_norm1.bias]
Loading weights:  87%|########7 | 347/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.layer_norm1.weight]
Loading weights:  87%|########7 | 348/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.layer_norm2.bias]
Loading weights:  88%|########7 | 349/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.layer_norm2.weight]
Loading weights:  88%|########7 | 350/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.mlp.fc1.bias]
Loading weights:  88%|########8 | 351/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.mlp.fc1.weight]
Loading weights:  88%|########8 | 352/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.mlp.fc2.bias]
Loading weights:  89%|########8 | 353/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.mlp.fc2.weight]
Loading weights:  89%|########8 | 354/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.k_proj.bias]
Loading weights:  89%|########9 | 355/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.k_proj.weight]
Loading weights:  89%|########9 | 356/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.out_proj.bias]
Loading weights:  90%|########9 | 357/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.out_proj.weight]
Loading weights:  90%|########9 | 358/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.q_proj.bias]
Loading weights:  90%|######### | 359/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.q_proj.weight]
Loading weights:  90%|######### | 360/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.v_proj.bias]
Loading weights:  91%|######### | 361/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.9.self_attn.v_proj.weight]
Loading weights:  91%|######### | 362/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.layer_norm1.bias]
Loading weights:  91%|#########1| 363/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.layer_norm1.weight]
Loading weights:  91%|#########1| 364/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.layer_norm2.bias]
Loading weights:  92%|#########1| 365/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.layer_norm2.weight]
Loading weights:  92%|#########1| 366/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.mlp.fc1.bias]
Loading weights:  92%|#########2| 367/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.mlp.fc1.weight]
Loading weights:  92%|#########2| 368/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.mlp.fc2.bias]
Loading weights:  93%|#########2| 369/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.mlp.fc2.weight]
Loading weights:  93%|#########2| 370/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.self_attn.k_proj.bias]
Loading weights:  93%|#########3| 371/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.self_attn.k_proj.weight]
Loading weights:  93%|#########3| 372/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.self_attn.out_proj.bias]
Loading weights:  94%|#########3| 373/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.self_attn.out_proj.weight]
Loading weights:  94%|#########3| 374/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.self_attn.q_proj.bias]
Loading weights:  94%|#########4| 375/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.self_attn.q_proj.weight]
Loading weights:  94%|#########4| 376/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.self_attn.v_proj.bias]
Loading weights:  95%|#########4| 377/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.10.self_attn.v_proj.weight]
Loading weights:  95%|#########4| 378/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.11.layer_norm1.bias]
Loading weights:  95%|#########5| 379/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.11.layer_norm1.weight]
Loading weights:  95%|#########5| 380/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.11.layer_norm2.bias]
Loading weights:  96%|#########5| 381/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.11.layer_norm2.weight]
Loading weights:  96%|#########5| 382/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.11.mlp.fc1.bias]
Loading weights:  96%|#########6| 383/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.11.mlp.fc1.weight]
Loading weights:  96%|#########6| 384/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.11.mlp.fc2.bias]
Loading weights:  97%|#########6| 385/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.11.mlp.fc2.weight]
Loading weights:  97%|#########6| 386/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.11.self_attn.k_proj.bias]
Loading weights:  97%|#########7| 387/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.11.self_attn.k_proj.weight]
Loading weights:  97%|#########7| 388/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.11.self_attn.out_proj.bias]
Loading weights:  98%|#########7| 389/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.11.self_attn.out_proj.weight]
Loading weights:  98%|#########7| 390/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.11.self_attn.q_proj.bias]
Loading weights:  98%|#########8| 391/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.11.self_attn.q_proj.weight]
Loading weights:  98%|#########8| 392/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.11.self_attn.v_proj.bias]
Loading weights:  99%|#########8| 393/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.encoder.layers.11.self_attn.v_proj.weight]
Loading weights:  99%|#########8| 394/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.post_layernorm.bias]
Loading weights:  99%|#########9| 395/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.post_layernorm.weight]
Loading weights:  99%|#########9| 396/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.pre_layrnorm.bias]
Loading weights: 100%|#########9| 397/398 [00:00<00:00, 2550.96it/s, Materializing param=vision_model.pre_layrnorm.weight]
Loading weights: 100%|##########| 398/398 [00:00<00:00, 2550.96it/s, Materializing param=visual_projection.weight]
Loading weights: 100%|##########| 398/398 [00:00<00:00, 2569.82it/s, Materializing param=visual_projection.weight]
CLIPModel LOAD REPORT from: openai/clip-vit-base-patch32
Key                                  | Status
-------------------------------------+-----------
vision_model.embeddings.position_ids | UNEXPECTED
text_model.embeddings.position_ids   | UNEXPECTED

Notes:
- UNEXPECTED: can be ignored when loading from different task/architecture; not ok if you expect identical arch.
The image processor of type `CLIPImageProcessor` is now loaded as a fast processor by default, even if the model checkpoint was saved with a slow processor. This is a breaking change and may produce slightly different outputs. To continue using the slow processor, instantiate this class with `use_fast=False`. 
INFO:     Started server process [47308]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8001 (Press CTRL+C to quit)