yitongl committed on
Commit
9d27bcd
·
verified ·
1 Parent(s): 99026aa

Upload sfp4 sparse09 ours-p checkpoint-750 transformer

README.md ADDED
@@ -0,0 +1,23 @@
+ # sfp4_v4_sparse09_hpo_on_ours_p_init2050 checkpoint-750
+
+ This upload contains the consolidated WanTransformer3DModel transformer weights
+ from:
+
+ `checkpoints/sfp4_v4_sparse09_hpo_on_ours_p_init2050_1n_interactive/checkpoint-750`
+
+ Contents:
+
+ - `transformer/config.json`
+ - `transformer/diffusion_pytorch_model.safetensors`
+
+ Training run:
+
+ - run name: `sfp4_v4_sparse09_hpo_on_ours_p_init2050_1n_interactive`
+ - source init: `sfp4_v4_sparse06_hpo_on_ours_p_1n_interactive_v2 checkpoint-2050`
+ - attention backend: `SPARSE_FP4_OURS_P_ATTN`
+ - high precision output for backward: enabled
+ - VSA sparsity: `0.9`
+
+ This package does not include the distributed optimizer/training-state
+ checkpoint. Use the original `distributed_checkpoint/` directory if exact
+ training resume state is required.
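As a rough illustration of how the package layout listed under "Contents:" might be consumed, the sketch below first checks that both files are present and then hands the directory to diffusers. The helper names `missing_files` and `load_transformer` are hypothetical, and the code assumes an installed diffusers release that ships `WanTransformer3DModel`. It loads the weights into the stock module only. Whether the `SPARSE_FP4_OURS_P_ATTN` backend is required at inference time is not stated in the README, so that part is left out.

```python
from pathlib import Path

# Relative paths listed under "Contents:" in the README.
EXPECTED_FILES = (
    "transformer/config.json",
    "transformer/diffusion_pytorch_model.safetensors",
)


def missing_files(root):
    """Return the expected files that are absent under `root` (hypothetical helper)."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


def load_transformer(root):
    """Load the transformer with diffusers (assumed installed; imported lazily)."""
    missing = missing_files(root)
    if missing:
        raise FileNotFoundError(f"missing from {root}: {missing}")
    # Assumption: the installed diffusers version provides WanTransformer3DModel.
    from diffusers import WanTransformer3DModel
    return WanTransformer3DModel.from_pretrained(root, subfolder="transformer")
```

Calling `load_transformer("path/to/download")` on a local copy of this upload would be the intended use. The path here is a placeholder, not a location named in the commit.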
transformer/config.json ADDED
@@ -0,0 +1,22 @@
+ {
+   "_class_name": "WanTransformer3DModel",
+   "added_kv_proj_dim": null,
+   "attention_head_dim": 128,
+   "cross_attn_norm": true,
+   "eps": 1e-06,
+   "ffn_dim": 8960,
+   "freq_dim": 256,
+   "image_dim": null,
+   "in_channels": 16,
+   "num_attention_heads": 12,
+   "num_layers": 30,
+   "out_channels": 16,
+   "patch_size": [
+     1,
+     2,
+     2
+   ],
+   "qk_norm": "rms_norm_across_heads",
+   "rope_max_seq_len": 1024,
+   "text_dim": 4096
+ }
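To make the config values easier to read, here is a small sketch that derives a few sizes from them. It assumes the usual diffusers convention that the transformer's hidden width is `num_attention_heads * attention_head_dim` and that each token covers one `patch_size` block of the latent. The variable names and the trimmed `CONFIG_JSON` string (a subset of the fields above) are illustrative only.

```python
import json
from math import prod

# Subset of transformer/config.json, copied from the diff above.
CONFIG_JSON = """
{
  "attention_head_dim": 128,
  "ffn_dim": 8960,
  "in_channels": 16,
  "num_attention_heads": 12,
  "num_layers": 30,
  "patch_size": [1, 2, 2]
}
"""

config = json.loads(CONFIG_JSON)

# Assumed convention: hidden width = heads * per-head dimension.
hidden_size = config["num_attention_heads"] * config["attention_head_dim"]  # 12 * 128 = 1536

# Latent values folded into one token: channels * (frames * height * width) per patch.
values_per_token = config["in_channels"] * prod(config["patch_size"])  # 16 * 4 = 64

# Feed-forward expansion relative to the hidden width.
ffn_ratio = config["ffn_dim"] / hidden_size

print(f"hidden_size={hidden_size}, values_per_token={values_per_token}, "
      f"ffn_ratio={ffn_ratio:.2f}, layers={config['num_layers']}")
```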
transformer/diffusion_pytorch_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:18b2922bb6e0480753e63da2488b6dd1f68cd23e6e2257e4007295d3a2ea5e0a
+ size 5676070784
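The three lines above are a Git LFS pointer, not the weights themselves: they record the hash algorithm, the digest, and the byte size of the real file. A minimal sketch of how a downloaded copy could be checked against that pointer follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and only the Python standard library is used.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:18b2922bb6e0480753e63da2488b6dd1f68cd23e6e2257e4007295d3a2ea5e0a
size 5676070784
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line and break the oid into algorithm and digest."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    path = Path(path)
    # Cheap check first: a size mismatch avoids hashing several gigabytes.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

For example, `matches_pointer("transformer/diffusion_pytorch_model.safetensors", parse_lfs_pointer(POINTER_TEXT))` would confirm that a local download is complete and uncorrupted, assuming the file sits at that relative path.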