| Using port 37124 |
| Starting rank=0, seed=0, world_size=1. |
| CDiT( |
| (x_embedder): PatchEmbed( |
| (proj): Conv2d(4, 1152, kernel_size=(2, 2), stride=(2, 2)) |
| (norm): Identity() |
| ) |
| (t_embedder): TimestepEmbedder( |
| (mlp): Sequential( |
| (0): Linear(in_features=256, out_features=1152, bias=True) |
| (1): SiLU() |
| (2): Linear(in_features=1152, out_features=1152, bias=True) |
| ) |
| ) |
| (y_embedder): ActionEmbedder( |
| (x_emb): TimestepEmbedder( |
| (mlp): Sequential( |
| (0): Linear(in_features=256, out_features=384, bias=True) |
| (1): SiLU() |
| (2): Linear(in_features=384, out_features=384, bias=True) |
| ) |
| ) |
| (y_emb): TimestepEmbedder( |
| (mlp): Sequential( |
| (0): Linear(in_features=256, out_features=384, bias=True) |
| (1): SiLU() |
| (2): Linear(in_features=384, out_features=384, bias=True) |
| ) |
| ) |
| (angle_emb): TimestepEmbedder( |
| (mlp): Sequential( |
| (0): Linear(in_features=256, out_features=384, bias=True) |
| (1): SiLU() |
| (2): Linear(in_features=384, out_features=384, bias=True) |
| ) |
| ) |
| ) |
| (blocks): ModuleList( |
| (0-27): 28 x CDiTBlock( |
| (norm1): LayerNorm((1152,), eps=1e-06, elementwise_affine=False) |
| (attn): Attention( |
| (qkv): Linear(in_features=1152, out_features=3456, bias=True) |
| (q_norm): Identity() |
| (k_norm): Identity() |
| (attn_drop): Dropout(p=0.0, inplace=False) |
| (proj): Linear(in_features=1152, out_features=1152, bias=True) |
| (proj_drop): Dropout(p=0.0, inplace=False) |
| ) |
| (norm2): LayerNorm((1152,), eps=1e-06, elementwise_affine=False) |
| (norm_cond): LayerNorm((1152,), eps=1e-06, elementwise_affine=False) |
| (cttn): MultiheadAttention( |
| (out_proj): NonDynamicallyQuantizableLinear(in_features=1152, out_features=1152, bias=True) |
| ) |
| (adaLN_modulation): Sequential( |
| (0): SiLU() |
| (1): Linear(in_features=1152, out_features=12672, bias=True) |
| ) |
| (norm3): LayerNorm((1152,), eps=1e-06, elementwise_affine=False) |
| (mlp): Mlp( |
| (fc1): Linear(in_features=1152, out_features=4608, bias=True) |
| (act): GELU(approximate='tanh') |
| (drop1): Dropout(p=0, inplace=False) |
| (norm): Identity() |
| (fc2): Linear(in_features=4608, out_features=1152, bias=True) |
| (drop2): Dropout(p=0, inplace=False) |
| ) |
| ) |
| ) |
| (final_layer): FinalLayer( |
| (norm_final): LayerNorm((1152,), eps=1e-06, elementwise_affine=False) |
| (linear): Linear(in_features=1152, out_features=32, bias=True) |
| (adaLN_modulation): Sequential( |
| (0): SiLU() |
| (1): Linear(in_features=1152, out_features=2304, bias=True) |
| ) |
| ) |
| (time_embedder): TimestepEmbedder( |
| (mlp): Sequential( |
| (0): Linear(in_features=256, out_features=1152, bias=True) |
| (1): SiLU() |
| (2): Linear(in_features=1152, out_features=1152, bias=True) |
| ) |
| ) |
| ) |
| Searching for model from logs/cdit_debug/checkpoints |
| ****** Evaluating from NON PREDEFINED index... ****** |
| Dataset: wuhan (train), size: 18652 |
| ****** Evaluating from NON PREDEFINED index... ****** |
| Dataset: wuhan (test), size: 4502 |
| ****** Evaluating from NON PREDEFINED index... ****** |
| Dataset: wuhan_auto (train), size: 17706 |
| ****** Evaluating from NON PREDEFINED index... ****** |
| Dataset: wuhan_auto (test), size: 4235 |
| Combining 2 datasets. |
|
|