Upload folder using huggingface_hub

Files changed (8)

README.md ADDED

+---
+license: gpl-3.0
+---
+In this work, we introduce Micro-World, an action-controlled interactive world model designed to generate high-quality, open-domain scenes. Building on the Wan2.1 family of models, we train both image-to-video (I2V) and text-to-video (T2V) variants to support a wide range of use cases. To foster open research and practical adoption in the community, we release the model weights, the full training and inference code, and a curated dataset tailored for controllable world modeling.
+For action injection, we favor AdaLN for its lightweight parameter footprint and ControlNet for its strong empirical stability during training.
+Note that the released I2V model is trained with the AdaLN architecture.
+For more information, please refer to the code.
+<div style="margin: 0; padding: 0; text-align: center;">
+  <img src="https://github.com/user-attachments/assets/680b87ac-0c95-4a27-b4fd-fcafb9fdf609" alt="t2v architecture" title="t2v architecture" class="t2v architecture">
+  <img src="https://github.com/user-attachments/assets/c9cd8d9e-9555-42d3-b884-04705d1e329c" alt="t2v architecture" title="t2v architecture" class="t2v architecture">
+</div>

lora_diffusion_pytorch_model.safetensors ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:20d52fc5ef9171a1aaa7f5560352772ebdc52906008bd95a71b60d10092c1213
+size 1458497816

transformer/config.json ADDED

+{
+  "_class_name": "WanActionAdaLNModel",
+  "_diffusers_version": "0.34.0",
+  "action_dim": 1536,
+  "cross_attn_norm": true,
+  "dim": 5120,
+  "eps": 1e-06,
+  "ffn_dim": 13824,
+  "freq_dim": 256,
+  "in_channels": 16,
+  "in_dim": 36,
+  "keyboard_dim": 7,
+  "model_type": "i2v",
+  "mouse_dim": 2,
+  "num_heads": 40,
+  "num_layers": 40,
+  "out_dim": 16,
+  "patch_size": [
+    1,
+    2,
+    2
+  ],
+  "qk_norm": true,
+  "text_dim": 4096,
+  "text_len": 512,
+  "window_size": [
+    -1,
+    -1
+  ]
+}
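As a rough reading aid for the config above, the following sketch derives a few quantities from the listed fields. The values are copied from `transformer/config.json`; the helper name `summarize` and the interpretations (per-head width as `dim / num_heads`, keyboard and mouse inputs counted together) are assumptions for illustration, not something the repository states.

```python
import json
from math import prod

# Subset of fields copied from transformer/config.json in this commit.
CONFIG_TEXT = """
{
  "dim": 5120,
  "num_heads": 40,
  "num_layers": 40,
  "patch_size": [1, 2, 2],
  "keyboard_dim": 7,
  "mouse_dim": 2,
  "action_dim": 1536
}
"""


def summarize(config: dict) -> dict:
    """Derive a few sizes from the config (hypothetical helper)."""
    dim, heads = config["dim"], config["num_heads"]
    if dim % heads != 0:
        raise ValueError("dim must be divisible by num_heads")
    return {
        # Assumption: the hidden width is split evenly across attention heads.
        "head_dim": dim // heads,
        # Latent positions covered by one (time, height, width) patch.
        "latent_positions_per_patch": prod(config["patch_size"]),
        # Assumption: keyboard and mouse signals are counted together
        # before being embedded into the action_dim-wide space.
        "raw_action_inputs": config["keyboard_dim"] + config["mouse_dim"],
    }


summary = summarize(json.loads(CONFIG_TEXT))
print(summary)
```

How the 9 raw action inputs are actually mapped to the 1536-dimensional action embedding is defined in the released code, not in this config file.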

transformer/diffusion_pytorch_model-00001-of-00004.safetensors ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:b93198ad45411face10c0d764b4c512883b859d56e205fd81319f3af05a0007f
+size 9957502392

transformer/diffusion_pytorch_model-00002-of-00004.safetensors ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:18739d9ef54f21fb8500701b8100ea82c7783916c64712ff8ac08bfff07793c7
+size 9954400440

transformer/diffusion_pytorch_model-00003-of-00004.safetensors ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:7be3524698564432f6af1f8bc2a4ab2af5121df495e9e2330d99f15acc0d7306
+size 9901951000

transformer/diffusion_pytorch_model-00004-of-00004.safetensors ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:71d18792f93a9a45d3802c73cae60177d9e2bdeb95554fb8fa10b87979d32c6a
+size 6761695152
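The four shard entries above are Git LFS pointer files, each with `version`, `oid`, and `size` lines. A minimal sketch, assuming only that three-line layout, parses the pointers and totals the transformer download size; `parse_lfs_pointer` is a hypothetical helper, and the pointer texts are copied from this diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its fields (hypothetical helper)."""
    # Each pointer line has the form "<key> <value>".
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size in bytes
    }


# Pointer contents for the four transformer shards, copied from the diff.
SHARD_POINTERS = [
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:b93198ad45411face10c0d764b4c512883b859d56e205fd81319f3af05a0007f\n"
    "size 9957502392\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:18739d9ef54f21fb8500701b8100ea82c7783916c64712ff8ac08bfff07793c7\n"
    "size 9954400440\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:7be3524698564432f6af1f8bc2a4ab2af5121df495e9e2330d99f15acc0d7306\n"
    "size 9901951000\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:71d18792f93a9a45d3802c73cae60177d9e2bdeb95554fb8fa10b87979d32c6a\n"
    "size 6761695152\n",
]

shards = [parse_lfs_pointer(p) for p in SHARD_POINTERS]
total_bytes = sum(s["size"] for s in shards)
print(f"{len(shards)} shards, {total_bytes} bytes ({total_bytes / 1e9:.2f} GB)")
```

The total covers only the four transformer shards; the separate LoRA file listed earlier adds another 1458497816 bytes.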

transformer/diffusion_pytorch_model.safetensors.index.json ADDED

The diff for this file is too large to render.