Upload folder using huggingface_hub
Browse files
README.md
ADDED
|
@@ -0,0 +1,14 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
---
|
| 2 |
+
license: gpl-3.0
|
| 3 |
+
---
|
| 4 |
+
In this work, we introduce Micro-World, an action-controlled interactive world model designed to generate high-quality, open-domain scenes. Built on top of the Wan2.1 family of models, we train both image-to-video (I2V) and text-to-video (T2V) variants to support a wide range of use cases. To foster open research and practical adoption in the community, we release the model weights, full training and inference code, as well as a curated dataset specifically tailored for controllable world modeling.
|
| 5 |
+
|
| 6 |
+
For action injection, we favor adaLN for its lightweight parameter footprint, and ControlNet for its strong empirical stability during training.
|
| 7 |
+
|
| 8 |
+
Note that the released T2V model is trained using the ControlNet architecture.
|
| 9 |
+
|
| 10 |
+
For more information, please refer to the code.
|
| 11 |
+
<div style="margin: 0; padding: 0; text-align: center;">
|
| 12 |
+
<img src="https://github.com/user-attachments/assets/680b87ac-0c95-4a27-b4fd-fcafb9fdf609" alt="t2v architecture" title="t2v architecture" class="t2v architecture">
|
| 13 |
+
<img src="https://github.com/user-attachments/assets/c9cd8d9e-9555-42d3-b884-04705d1e329c" alt="t2v architecture" title="t2v architecture" class="t2v architecture">
|
| 14 |
+
</div>
|
lora_diffusion_pytorch_model.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:a395da7ef7776bd567efcc4b85aa4918a577c1399bf2a6967d981acf50ab04e2
|
| 3 |
+
size 356705124
|
transformer/config.json
ADDED
|
@@ -0,0 +1,31 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
{
|
| 2 |
+
"_class_name": "WanActionControlNetModel",
|
| 3 |
+
"_diffusers_version": "0.34.0",
|
| 4 |
+
"cross_attn_norm": true,
|
| 5 |
+
"dim": 1536,
|
| 6 |
+
"eps": 1e-06,
|
| 7 |
+
"ffn_dim": 8960,
|
| 8 |
+
"freq_dim": 256,
|
| 9 |
+
"in_dim": 16,
|
| 10 |
+
"keyboard_dim": 7,
|
| 11 |
+
"model_type": "t2v",
|
| 12 |
+
"mouse_dim": 2,
|
| 13 |
+
"num_heads": 12,
|
| 14 |
+
"num_layers": 30,
|
| 15 |
+
"out_dim": 16,
|
| 16 |
+
"patch_size": [
|
| 17 |
+
1,
|
| 18 |
+
2,
|
| 19 |
+
2
|
| 20 |
+
],
|
| 21 |
+
"qk_norm": true,
|
| 22 |
+
"text_dim": 4096,
|
| 23 |
+
"text_len": 512,
|
| 24 |
+
"action_in_dim": null,
|
| 25 |
+
"action_layers": null,
|
| 26 |
+
"window_size": [
|
| 27 |
+
-1,
|
| 28 |
+
-1
|
| 29 |
+
]
|
| 30 |
+
}
|
| 31 |
+
|
transformer/diffusion_pytorch_model.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:46144735a50095b54751327934552cde9a3e091a624b19d29f12a1389bc4e2e7
|
| 3 |
+
size 4315208944
|